Abstract

Algorithms distinguishing jets originating from b quarks from other jet flavors are important tools in the
physics program of the DØ experiment at the Fermilab Tevatron $p\bar{p}$ collider. This article describes the
methods that have been used to identify b-quark jets, exploiting in particular the long lifetimes of b-flavored
hadrons, and the calibration of the performance of these algorithms based on collider data.
1. Introduction

The bottom quark occupies a special place among the fundamental fermions: on the one hand, its mass 
(of the order of 5 GeV [1]) is substantially larger than that of the (next lightest) charm quark. On the 
other hand, it is light enough to be produced copiously at present-day high energy colliders. 

In particular, unlike the top quark, the bottom quark is lighter than the W boson, preventing decays
to on-shell W bosons. As a result, it lives long enough for hadronization to occur before its decay. The
average lifetime of b-flavored hadrons (referred to as b hadrons in the following) has been measured to be
about 1.5 ps [1]: this is sufficiently long for b hadrons, even of moderate momentum, to travel distances of
the order of at least a mm. Combined with the relatively large mass of b hadrons, the use of precise tracking
information allows the detection of the presence of b hadrons through their charged decay products. In
addition, b hadron decays often lead to the production of high momentum leptons; especially at hadron
colliders, the observation of such leptons provides easy access to samples with enhanced b-jet content. The
identification of jets originating from the hadronization of bottom quarks (referred to as b-jet identification
or b-tagging in the following) in the DØ experiment is the subject of this publication.

1.1. The upgraded DØ detector

The DØ experiment is one of the two experiments operating at the Tevatron $p\bar{p}$ Collider at Fermilab. After
a successful Tevatron Run I, which led to the discovery of the top quark, the Tevatron was upgraded
to provide both a higher center-of-mass energy (from 1.8 TeV to 1.96 TeV) and a significant increase in
luminosity. Run II started in 2001, and the Tevatron delivered 1.6 fb$^{-1}$ of integrated luminosity to the
experiments by March 2006, at which time another detector upgrade was commissioned. This publication
refers to the Run II data taken before March 2006, commonly denoted as the Run IIa period.

To cope with the increased luminosity and decreased bunch spacing (from 3.6 μs to 396 ns) in Run II,
the DØ detector also underwent a significant upgrade, described in detail elsewhere. In particular, a
2 T central solenoid was installed to provide an axial magnetic field used to measure the momentum of
charged particles. Correspondingly, the existing tracking detectors were removed and replaced with two new
detectors, shown in Fig. 1:

• the central fiber tracker (CFT), consisting of about 77,000 axial and small-angle stereo scintillating
fibers arranged in eight concentric layers, and covering the pseudorapidity region $|\eta| < 1.7$;

• and the silicon microstrip tracker (SMT) [5], a detector featuring 912 silicon strip sensor modules
arranged in six barrel and sixteen disk structures, allowing tracking up to $|\eta| < 3$. Of particular
interest is the innermost layer of SMT barrel sensors: its proximity to the beam line (at a radius of
2.7 cm) results in a relatively small uncertainty in the extrapolation of tracks to the beam line, and
hence in good vertex reconstruction capabilities.



*The coordinate system used in this article is a cylindrical one with the z axis chosen along the proton beam direction,
and with polar and azimuthal angles $\theta$ and $\phi$ (measured with respect to the selected primary vertex, as explained in Sec. 2.1).
Pseudorapidity $\eta$ is defined as $\eta = -\ln[\tan(\theta/2)]$, and approximates rapidity
$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}$ (rapidity differences are invariant under Lorentz transformations along the beam axis).
Figure 1: The Run IIa central tracking detectors.

The ability to identify efficiently the b quarks† in an event considerably broadens the range of physics
topics that can be studied by the DØ experiment in Run II. While the analysis leading to the observation of
the top quark by DØ in Run I employed only semimuonic decays $b \to \mu X$, the use of lifetime tagging allows for
a more precise determination of the top quark properties (see e.g. [6]). The search for electroweak production
of top quarks ($p\bar{p} \to tb + X$, $tqb + X$) relies heavily as well on the efficiency to identify b jets and reject
light and c jets, as demonstrated in the recent observation of this production process [7]. The search for the
standard model Higgs boson also depends on b-jet identification: a relatively light ($m_H < 135$ GeV) Higgs
boson will decay predominantly to $b\bar{b}$ quark pairs. Finally, various regions of the minimal supersymmetric
standard model parameter space lead to final states containing b quarks (from gluino or stop quark decays),
and efficient b tagging greatly increases the sensitivity of the search for those final states.

This article is subdivided as follows. Section 2 describes the objects that serve as input to the b-tagging
algorithms. Section 3 introduces the steps taken before applying the tagging algorithms proper. Sections 4,
5, and 6 describe the basic ways in which lifetime-correlated variables are extracted. Section 7 describes
the combination of these variables in an artificial neural network to obtain an optimal tagging performance.
Finally, Sections 8 and 9 detail how Tevatron data are used to calibrate the performance of the resulting
tagging algorithm.

2. Object Reconstruction 

Besides the charged particle tracks, which are reconstructed from hits (clustered energy deposits) in 
the CFT and SMT detectors, the input for lifetime identification of b-quark jets consists of two kinds of
reconstructed objects: 

• primary vertices, which are built from two or more charged particle tracks that originate from a 
common point in space; 

• hadron jets, which are reconstructed primarily from their energy deposition in the calorimeter. 
These objects are described below. 



†In this article, charge conjugated states are implied as well.
2.1. Primary vertex reconstruction

The knowledge of the $p\bar{p}$ interaction point or primary vertex of an interaction is important to provide
the most precise reference point for the lifetime-based tagging algorithms described in subsequent sections.
In addition, multiple interactions can occur during a single bunch crossing. It is therefore necessary to select
the primary vertex associated with the interaction of interest. The reconstruction and identification of the
primary vertex at DØ consists of the following steps: (i) track selection; (ii) vertex fitting using a Kalman
filter algorithm to obtain a list of candidate vertices; (iii) a second vertex fitting iteration using an adaptive
algorithm to reduce the effect of outlier tracks; and (iv) primary vertex selection.

In the first stage, tracks are selected if their momentum component in the plane perpendicular to the
beam line, $p_T$, exceeds 0.5 GeV, and they have two or more hits in the SMT (counted as the number of hit
ladders or disks) if the track is within the SMT geometric acceptance as measured in the $(\eta, z)$ plane. The
selected tracks are then clustered along the z direction in 2 cm regions to separate groups of tracks coming
from different interactions, as evidenced by the tracks' z coordinates at their distance of closest approach
to the beam line.
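
The first stage can be illustrated with a short Python sketch. It is only a toy under stated assumptions: the track representation (dictionaries with the hypothetical keys `pt`, `smt_hits`, `in_smt_acceptance`, and `z_dca`) is invented for illustration, and the clustering shown is a simple sorted, gap-based grouping, since the text does not specify the exact clustering procedure used at DØ.

```python
def select_tracks(tracks, pt_min=0.5, min_smt_hits=2):
    """Keep tracks with pT > pt_min (GeV); inside the SMT acceptance also require SMT hits."""
    selected = []
    for trk in tracks:
        if trk["pt"] <= pt_min:
            continue
        if trk["in_smt_acceptance"] and trk["smt_hits"] < min_smt_hits:
            continue
        selected.append(trk)
    return selected


def cluster_in_z(tracks, max_gap_cm=2.0):
    """Group tracks by their z coordinate at closest approach to the beam line.

    Assumption: a new cluster starts whenever the gap to the previous
    (z-sorted) track exceeds max_gap_cm. Chains of nearby tracks can
    therefore span more than max_gap_cm in total.
    """
    clusters = []
    for trk in sorted(tracks, key=lambda t: t["z_dca"]):
        if clusters and trk["z_dca"] - clusters[-1][-1]["z_dca"] <= max_gap_cm:
            clusters[-1].append(trk)
        else:
            clusters.append([trk])
    return clusters


# Hypothetical tracks, for illustration only.
tracks = [
    {"pt": 1.2, "smt_hits": 3, "in_smt_acceptance": True, "z_dca": 0.1},
    {"pt": 0.4, "smt_hits": 4, "in_smt_acceptance": True, "z_dca": 0.2},    # fails the pT cut
    {"pt": 2.0, "smt_hits": 1, "in_smt_acceptance": True, "z_dca": 0.3},    # too few SMT hits
    {"pt": 0.9, "smt_hits": 0, "in_smt_acceptance": False, "z_dca": -0.4},  # outside SMT: kept
    {"pt": 3.1, "smt_hits": 2, "in_smt_acceptance": True, "z_dca": 12.5},
]
clusters = cluster_in_z(select_tracks(tracks))
```

With these inputs, three tracks survive the selection and fall into two z clusters, each of which would then be passed to the vertex fit.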

In the second stage, the tracks in each of the z clusters are used to reconstruct a vertex. This is done in
two passes. In the first pass, the selected tracks in each cluster are fitted to a common vertex. A Kalman
filter vertex fitter is used for this step, in which tracks with the highest $\chi^2$ contribution to the vertex are
removed in turn, until the total vertex $\chi^2$ per degree of freedom is less than 10. In the second pass, the
track selection in each z cluster is refined based on the track's distance of closest approach in the transverse
plane, d, to the vertex position computed in the first pass, as well as on its uncertainty $\sigma_d$: only tracks with
impact parameter significance $S_d = d/\sigma_d$ satisfying $|S_d| < 5$ are retained.

Once the outliers with respect to the beam position have been removed from the selected tracks, an
adaptive vertex algorithm [8, 9] is used to fit the selected tracks to a common vertex in each cluster. This
algorithm differs from the Kalman filter vertex fitter in that all tracks remaining after the Kalman filter
selection procedure are allowed to contribute to the final vertex fit instead of rejecting those tracks whose $\chi^2$
contribution to the vertex fit exceeds a certain value. It is especially suited to reducing the contribution of
distant tracks to the vertex fit, thus obtaining a better separation between primary and secondary vertices.
The algorithm proceeds in three iterative stages: (i) the track candidates in a z-cluster are fitted using a
Kalman filter; (ii) each track is weighted according to its $\chi^2$ contribution to the vertex found in the previous
step, and if the weight is less than $10^{-6}$, the track is eliminated; and (iii) steps (i) and (ii) are repeated
until either the weights converge (the maximum change in track weights between consecutive iterations is
less than $10^{-3}$) or more than 100 iterations have been performed.

The weights are adapted in each iteration: a track i associated with a small weight in one iteration will
affect the weights of all other tracks in the next iteration because they are derived with respect to the new
vertex position obtained with a down-weighted contribution from track i. The tracks are weighted in step
(ii) according to their $\chi^2$ contribution to the vertex by a sigmoidal function:

$$ w_i = \frac{1}{1 + e^{(\chi_i^2 - \chi^2_{\text{cutoff}})/2T}}. \qquad (1) $$

Here, $\chi_i^2$ is the $\chi^2$ contribution of track i to the vertex; $\chi^2_{\text{cutoff}}$ is the $\chi^2$ value at which the weight function
drops to 0.5; and T, like the temperature in the Fermi function in statistical thermodynamics, is a parameter
that controls the sharpness of the function. Figure 2 shows the weight function used at DØ with $\chi^2_{\text{cutoff}} =$ […]
and $T = 1$.
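
The sigmoidal weighting and the iteration described above can be sketched in Python as follows. This is a one-dimensional toy, not the DØ implementation: a weighted mean of track z positions stands in for the Kalman filter fit, the function names are hypothetical, and the value `chi2_cutoff=4.0` in the example is an illustrative assumption rather than a value taken from the text. The thresholds $10^{-6}$, $10^{-3}$, and 100 iterations are those quoted above.

```python
import math


def adaptive_weight(chi2, chi2_cutoff, temperature=1.0):
    """Sigmoidal weight 1 / (1 + exp((chi2 - chi2_cutoff) / (2 T)))."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    x = (chi2 - chi2_cutoff) / (2.0 * temperature)
    if x > 700.0:  # exp() would overflow; the weight is numerically zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def adaptive_fit_1d(z, sigma, chi2_cutoff, temperature=1.0,
                    min_weight=1e-6, tol=1e-3, max_iter=100):
    """Toy adaptive fit of a common z position from track z values and uncertainties."""
    weights = [1.0] * len(z)
    active = list(range(len(z)))
    vertex = None
    for _ in range(max_iter):
        # (i) weighted least-squares position (toy stand-in for the Kalman filter fit)
        norm = sum(weights[i] / sigma[i] ** 2 for i in active)
        vertex = sum(weights[i] * z[i] / sigma[i] ** 2 for i in active) / norm
        # (ii) recompute each weight from the chi2 with respect to the new vertex
        max_change = 0.0
        for i in active:
            chi2 = ((z[i] - vertex) / sigma[i]) ** 2
            new_weight = adaptive_weight(chi2, chi2_cutoff, temperature)
            max_change = max(max_change, abs(new_weight - weights[i]))
            weights[i] = new_weight
        # eliminate tracks whose weight has become negligible
        for i in active:
            if weights[i] < min_weight:
                weights[i] = 0.0
        active = [i for i in active if weights[i] > 0.0]
        if not active:
            raise RuntimeError("all tracks were rejected")
        # (iii) stop once the weights have converged
        if max_change < tol:
            break
    return vertex, weights


# Four compatible tracks and one outlier (hypothetical values, in cm).
z = [0.00, 0.02, -0.02, 0.01, 0.30]
sigma = [0.05] * 5
vertex, weights = adaptive_fit_1d(z, sigma, chi2_cutoff=4.0)
```

In this example the outlier is progressively down-weighted and finally dropped, so the fitted position is determined by the four compatible tracks.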

Once all possible vertices in the event are accurately reconstructed, the fourth and final step consists of
selecting which of the fitted vertices is the result of the hard scatter interaction. The hard scatter vertex is
distinguished from soft-interaction vertices by the higher average $p_T$ of its tracks, as shown in Fig. 3. The
probability $\mathcal{P}(p_T)$ that the observed $p_T$ of a given track is compatible with the track originating from a
soft interaction is computed as

$$ \mathcal{P}(p_T) = \frac{\int_{p_T}^{\infty} \mathcal{F}(p_T')\, dp_T'}{\int_{p_T^{\min}}^{\infty} \mathcal{F}(p_T')\, dp_T'}. \qquad (2) $$
Figure 2: Weight given to tracks as a function of the $\chi^2$ distance to the vertex for different T parameters. In the adaptive
fitting, all tracks contribute to the fit when $T > 0$. The value used in the reconstruction algorithm is $T = 1$.

Here, $\mathcal{F}(p_T)$ is the $p_T$ distribution of tracks attached to soft interaction vertices in $Z \to \mu^+\mu^-$ candidate
events, requiring that they be separated from the $Z \to \mu^+\mu^-$ interaction vertex by more than 10 cm.
$\mathcal{F}(p_T)$ is depicted in Fig. 3(a). Only tracks with $p_T > p_T^{\min} = 0.5$ GeV are used in this calculation. For each
reconstructed vertex, the probability that it is consistent with a minimum bias interaction is formed as

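$$ \mathcal{P}_{\text{MB}} = \Pi \cdot \sum_{j=0}^{N_{\text{trk}}-1} \frac{(-\ln \Pi)^j}{j!} \quad \text{with} \qquad (3) $$

$$ \Pi = \prod_{i=1}^{N_{\text{trk}}} \mathcal{P}(p_{T,i}), $$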

where $N_{\text{trk}}$ is the number of tracks attached to the vertex; a motivation for this expression is provided in
Sec. 5. The selected primary vertex is the one with the lowest $\mathcal{P}_{\text{MB}}$; the resulting $\mathcal{P}_{\text{MB}}$ distribution is shown
in Fig. 3(b).
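
As a rough illustration of how the minimum bias probability is built from per-track probabilities, the Python sketch below evaluates the track probability from an empirical sample of soft-interaction track $p_T$ values (a discrete stand-in for the integrals over the spectrum) and then combines the track probabilities of a vertex. The function names and the input numbers are hypothetical; only the 0.5 GeV threshold comes from the text.

```python
import math


def track_soft_probability(pt, soft_track_pts, pt_min=0.5):
    """Fraction of reference soft-interaction tracks (above pt_min) with pT >= pt."""
    if pt < pt_min:
        raise ValueError("only tracks with pT above pt_min enter the calculation")
    above_min = [x for x in soft_track_pts if x >= pt_min]
    if not above_min:
        raise ValueError("reference sample has no tracks above pt_min")
    return sum(1 for x in above_min if x >= pt) / len(above_min)


def min_bias_probability(track_probabilities):
    """Combine track probabilities: Pi * sum_{j < N} (-ln Pi)^j / j!, with Pi their product."""
    n_trk = len(track_probabilities)
    if n_trk == 0:
        raise ValueError("a vertex needs at least one track")
    # -ln(Pi) is accumulated as a sum of logs; zero probabilities are clamped to a tiny floor.
    minus_log_pi = -sum(math.log(max(p, 1e-300)) for p in track_probabilities)
    total, term = 0.0, 1.0
    for j in range(n_trk):
        total += term                     # adds (-ln Pi)^j / j!
        term *= minus_log_pi / (j + 1)    # prepares the next term
    return math.exp(-minus_log_pi) * total


# Hypothetical per-track probabilities for a soft-like and a hard-like vertex.
soft_like = min_bias_probability([0.5, 0.5])
hard_like = min_bias_probability([0.01, 0.02])
```

A vertex with several high-$p_T$ tracks has small track probabilities and therefore a small combined probability, which is why the vertex with the lowest value is taken as the hard scatter vertex.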

The reconstruction and identification efficiency in data is between 97% and 100% for primary vertices
reconstructed up to $|z| = 100$ cm, as measured on the $Z \to \mu^+\mu^-$ candidate event sample. For multijet
events, the position resolution of the selected primary vertex in the transverse plane can be determined by
subtracting the known beam width quadratically from the width of the observed vertex position distribution.
This resolution improves with increasing track multiplicity and is better than the beam width (around 30 μm)
for events with at least 10 tracks attached to the primary vertex.
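
The quadratic subtraction mentioned above amounts to a one-line calculation, sketched here in Python. The observed width of 34 μm is a made-up number chosen for illustration; the beam width of about 30 μm is the value quoted in the text.

```python
import math


def vertex_resolution(observed_width, beam_width):
    """Resolution obtained by subtracting the beam width in quadrature from the observed width."""
    diff = observed_width ** 2 - beam_width ** 2
    if diff < 0.0:
        raise ValueError("observed width is smaller than the beam width")
    return math.sqrt(diff)


# Hypothetical observed width of 34 um with a 30 um beam width.
resolution_um = vertex_resolution(34.0, 30.0)
```

For these inputs the resolution is 16 μm, i.e. smaller than the beam width itself.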

2.2. Jet reconstruction and calibration 

The vast majority of data analyses in DØ make use of so-called cone jets, which collect all calorimeter
energy deposits within a fixed angular distance $\mathcal{R} = \sqrt{(\Delta y)^2 + (\Delta\phi)^2}$ in $(y, \phi)$ space. Specifically, the
cone jet reconstruction algorithm used within DØ is the Run II cone jet algorithm [10]. This algorithm is
insensitive to the presence of soft or collinear radiation off partons, thus allowing for detailed comparisons
of jet distributions in the DØ data with theoretical predictions. The cone radii used in analyses in DØ are
$\mathcal{R} = 0.5$ and $\mathcal{R} = 0.7$, but for most high $p_T$ physics the $\mathcal{R} = 0.5$ cone jets are used. It is only these jets that
are described in this article.
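
The angular distance used to define a cone can be computed as in the following Python sketch. It only illustrates the distance measure, with the azimuthal difference wrapped into $[-\pi, \pi)$; it is not an implementation of the Run II cone jet algorithm, and the function names are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def cone_distance(y1, phi1, y2, phi2):
    """Angular distance sqrt((dy)^2 + (dphi)^2) in (y, phi) space."""
    return math.hypot(y1 - y2, delta_phi(phi1, phi2))


def in_cone(axis, deposit, radius=0.5):
    """True if a deposit at (y, phi) lies within the cone around the axis (y, phi)."""
    return cone_distance(axis[0], axis[1], deposit[0], deposit[1]) < radius
```

The wrapping matters for deposits on either side of $\phi = 0$: two directions at $\phi = 0.1$ and $\phi = 2\pi - 0.1$ are separated by 0.2, not by almost $2\pi$.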

As the jets are reconstructed on the basis of calorimetric information, comparisons between the data and 
jets simulated using Monte Carlo (MC) methods require corrections for various effects: 

• energy deposits not from the hard interaction (either from the remnant of the original $p\bar{p}$ system or
from additional soft interactions or noise);

• the (energy dependent) calorimeter response to incident high-energy particles;

• the net energy flow through the jet cone because of the finite size of showers in the calorimeter and
the bending of charged particle trajectories by the magnetic field.

Figure 3: (a): The track $p_T$ spectrum for soft-interaction vertices in the DØ data selected as described in the text. For
comparison, the corresponding distribution is also shown for the primary vertices in one data run selected on the basis of
their $\mathcal{P}_{\text{MB}}$ values, as also described in the text. The soft-interaction vertex distribution is $\mathcal{F}(p_T)$ in Eq. 2. (b): The $\mathcal{P}_{\text{MB}}$
distribution for soft-interaction vertices and multijet (hard scatter) primary vertices.



The determination of the jet energy scale (JES) is described in a separate paper [11]. The
resulting JES, by itself, does not yet account for neutrinos from decays of b- or c-flavored hadrons escaping
undetected. While such additional corrections may be important for physics analyses, b tagging is
sensitive to the JES only because the tagging performance obtained in data (Sec. 9) is parametrized in terms of jet
$p_T$ (and $\eta$) and applied to simulated jets. For this purpose, corrections for undetected neutrinos (and for
the energy not deposited in the calorimeter, in the case of muons associated with jets) do not need to be
applied, provided that data and simulated jets are treated identically.



3. Tagging Prerequisites 

In order to evaluate the performance of the b-tagging algorithms described in the following sections, it
is necessary to define properly the meaning of "b jet". Also, several steps are carried out before proceeding
with the tagging proper. These steps are discussed below.

3.1. Flavor assignment in simulated events 

The b-tagging algorithms used within the DØ experiment are jet based rather than event based. This
choice makes sense especially for high-luminosity hadron colliders, where pile-up (overlapping electronics
signals) from previous interactions, as well as multiple interactions in the same bunch crossing, may lead to
other reconstructed tracks and jets in the event besides those of the "interesting" high-$p_T$ interaction.

However, this choice introduces an ambiguity for jets in simulated events: in order to estimate the
performance of a b-tagging algorithm, it is first necessary to specify precisely how a jet's flavor is determined.
The following choice has been made:

• if at the particle level (i.e., after the hadronization of the partonic final state), a hadron containing a
b quark (denoted in the following as b hadron) is found within a $\mathcal{R} = 0.5$ radius of the jet direction‡,
the jet is considered to be a b jet;

• if no b hadron is found, but a hadron containing a c quark (henceforth denoted as a c hadron) is found
instead, the jet is considered to be a c jet;

• if no c hadron is found either, the jet is considered to be a light-flavor jet.

‡Apart from the jet cone definition, as discussed in Sec. 2.2, angular distances in this article are determined in $(\eta, \phi)$ space.

This choice is preferred over the association with a parton level b or c quark, as in the latter case, parton
showering may lead to a large distance $\Delta\mathcal{R}$ between the original quark direction and that of the corresponding
jet(s).
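
The flavor assignment rules above translate directly into a small Python function. The sketch assumes a hypothetical input format in which each particle-level heavy-flavor hadron is given as a tuple `(flavor, eta, phi)` with `flavor` equal to `"b"` or `"c"`; the matching radius of 0.5 is the one quoted in the text.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in (eta, phi) space, with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def assign_jet_flavor(jet_eta, jet_phi, hadrons, radius=0.5):
    """Return 'b', 'c', or 'light' following the b > c > light priority."""
    flavors_in_cone = {
        flavor
        for flavor, eta, phi in hadrons
        if delta_r(jet_eta, jet_phi, eta, phi) < radius
    }
    if "b" in flavors_in_cone:
        return "b"
    if "c" in flavors_in_cone:
        return "c"
    return "light"
```

A jet with both a b hadron and a c hadron inside the cone is labeled a b jet, which is the natural outcome for cascade decays of b hadrons into c hadrons.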

3.2. Taggability 

The jet tagging algorithms described in the following sections are based entirely on tracking and vertexing 
of charged particles. Therefore, a very basic requirement is that there should be charged particle tracks 
associated with the (calorimeter) jet. Rather than incorporating such basic requirements in the tagging 
algorithms themselves, they are implemented as a separate step. 

The reason for this is that the tagging algorithm's performance must be evaluated on the data, as
detailed in Sec. 8. It is parametrized in terms of the jet kinematics ($p_T$ and $|\eta|$). This parametrization
presupposes that there are no further dependences. However, the interaction region at the DØ detector is
quite long, $\sigma_z \approx 25$ cm, and the detector acceptance affects the track reconstruction efficiency dependence
on $\eta$ differently for different values of the interaction point's z coordinate; hence the above parametrization
is only possible once this z dependence is accounted for.

The requirement for a jet to be taggable, i.e., for it to be considered for further application of the tagging
algorithms, is that it should be within $\Delta\mathcal{R} = 0.5$ from a so-called track jet. Track jets are reconstructed
starting from tracks having at least one hit in the SMT, a distance to the selected primary vertex less than
2 mm in the transverse plane and less than 4 mm in the z direction, and $p_T > 0.5$ GeV. Starting with
"seed" tracks having $p_T > 1$ GeV, the Snowmass jet algorithm [12] is used to cluster the tracks within
cones of radius $\mathcal{R} = 0.5$.

As an example, Fig. 4 shows the taggability for the sample used to determine the b-jet efficiency in Sec. 8.
The taggability is determined as a function of both the jet kinematics ($p_T$ and $\eta$) and the z coordinate of
the selected primary vertex, as detailed above. In detail, the coordinate used in Fig. 4(a) is $z' = |z| \cdot \mathrm{sign}(\eta \cdot z)$,
as this variable makes optimal use of the geometrical correlations between $\eta$ and z. In different z' regions, as
shown in Fig. 4(b), only $|\eta|$ and $p_T$ are used as further independent variables. The taggability parametrization
in each z' bin is two dimensional, even though only the projection onto $p_T$ is shown. The tagging algorithms
discussed in subsequent sections are applied only to jets retained by the taggability criterion, and the
quoted performances do not account for the inefficiency incurred by this criterion.

Figure 4: (a): taggability as a function of $z' = |z| \cdot \mathrm{sign}(\eta \cdot z)$. The vertical lines denote the boundaries chosen for the
parametrization in $p_T$ and $|\eta|$. (b): taggability as a function of jet $p_T$, in different bins of z'. The curves for the two central
bins are very similar and have been combined.
3.3. $V^0$ rejection

By construction, the lifetime tagging algorithms assume that any measurable lifetime is indicative of
heavy flavor jets. However, light strange hadrons also decay weakly, with long lifetimes. Particularly
pernicious backgrounds arise from $K_S^0$ and $\Lambda$, as their lifetimes (90 ps and 263 ps, respectively) do not differ
vastly from those of b hadrons. In addition, $\gamma \to e^+e^-$ conversions may occur in the detector material at
large distances from the beam line.

$K_S^0$ and $\Lambda$ candidates, commonly denoted as $V^0$s, are identified through two oppositely charged tracks
satisfying the following criteria:

• The significance of the distance of closest approach (DCA) to the selected primary vertex in the
transverse plane, $S_d$ (see Sec. 2.1), of both tracks must satisfy $|S_d| > 3$.

• The tracks' z coordinates at the point of closest approach in the transverse plane must be displaced
from the primary vertex by less than 1 cm, to suppress misreconstructed tracks.

• The resulting $V^0$ candidate must have a distance of closest approach to the primary vertex of less than
200 μm. This requirement is intended to select only those $V^0$ candidates originating from the primary
vertex, while candidates originating from heavy flavor decays may be taken into account during the
tagging.

• The reconstructed mass should satisfy 472 MeV < m < 516 MeV for $K_S^0$ candidates, and 1108 MeV <
m < 1122 MeV for $\Lambda$ candidates (in the latter case, the higher $p_T$ tracks are considered to be protons;
the other tracks, or both tracks in the case of $K_S^0$ reconstruction, are assumed to be charged pions).
The invariant mass distributions of reconstructed $K_S^0$ and $\Lambda$ candidates are shown in Fig. 5.
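
The mass criterion in the last item can be illustrated with a short Python sketch. It covers only the invariant mass windows, not the impact parameter or vertex requirements, and its function names and input format (track momenta as `(px, py, pz)` tuples in GeV) are hypothetical. The pion and proton masses are standard values; the mass windows and the rule that the higher-$p_T$ track is taken as the proton are those stated above.

```python
import math

M_PION = 0.13957    # charged pion mass in GeV
M_PROTON = 0.93827  # proton mass in GeV


def invariant_mass(p1, m1, p2, m2):
    """Invariant mass of two tracks with momenta p1, p2 (GeV) and mass hypotheses m1, m2."""
    e1 = math.sqrt(m1 * m1 + sum(c * c for c in p1))
    e2 = math.sqrt(m2 * m2 + sum(c * c for c in p2))
    p_sum_sq = sum((a + b) ** 2 for a, b in zip(p1, p2))
    return math.sqrt(max((e1 + e2) ** 2 - p_sum_sq, 0.0))


def v0_mass_hypotheses(p1, p2):
    """Return the list of V0 hypotheses whose mass window contains the track pair."""
    matches = []
    if 0.472 < invariant_mass(p1, M_PION, p2, M_PION) < 0.516:
        matches.append("K0S")
    # For the Lambda hypothesis the higher-pT track is taken to be the proton.
    pt1 = math.hypot(p1[0], p1[1])
    pt2 = math.hypot(p2[0], p2[1])
    proton, pion = (p1, p2) if pt1 >= pt2 else (p2, p1)
    if 1.108 < invariant_mass(proton, M_PROTON, pion, M_PION) < 1.122:
        matches.append("Lambda")
    return matches
```

As a usage example, two back-to-back pions with momentum 0.20596 GeV each reconstruct to a mass of about 0.4976 GeV and are therefore flagged as a $K_S^0$ candidate.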




Figure 5: Reconstructed $K_S^0$ (a) and $\Lambda$ (b) mass peaks, as a function of $M(\pi\pi)$ and $M(p\pi)$ (GeV), respectively.

The $V^0$ finding efficiency depends on the transverse momentum of the $V^0$ decay products, as well as on
the position of the decay vertex inside the tracker volume. As an example, Fig. 6 shows the $K_S^0$ finding
efficiency as a function of the transverse position of the decay vertex for the cases when both or at least one
of the decay pions have a transverse momentum greater than 1 GeV. This efficiency has been determined
using simulated multijet events.

Photon conversions are most easily recognized by the fact that the opening angle between the electron
and positron is negligibly small. In the plane perpendicular to the beam line, this is exploited by requiring
that the tracks be less than 30 μm apart at the location where their trajectories are parallel to each other.
In addition, they must again be oppositely charged, and their invariant mass is required to be less than
25 MeV. Since conversions happen inside material, the locations of their vertices reflect the distribution of
material inside the detector, as illustrated by Fig. 7: the locations of the SMT barrel ladders and the disks
are clearly visible in the distribution of the radial and z coordinates, respectively, of reconstructed photon
conversions.
Figure 6: The $K_S^0 \to \pi^+\pi^-$ finding efficiency as a function of the transverse position of the decay vertex as determined in
simulated multijet events.

Figure 7: Reconstructed radial (a) and z coordinate (b) of candidate conversion vertices. The peaks correspond to the radial
positions of the SMT barrels and z positions of the SMT disk structures, respectively.



4. The Secondary Vertex Tagger 

The vast majority of b-hadron decays give rise to multiple charged particles emanating from the b-hadron's
decay point. The most intuitive tagging method is therefore to attempt to reconstruct this decay point
explicitly and to require the presence of a displaced or secondary vertex. The requirement that a number
of tracks all be extrapolated to the same point in three dimensions is expected to lead to an algorithm, the
Secondary Vertex Tagger (SVT) [13], which is robust even in the presence of misreconstructed tracks.

After the identification and selection of the primary interaction vertex, the reconstruction of secondary
vertices starts from the track jet associated with each (taggable) calorimeter jet (see Sec. 3.2). The tracks
considered are those associated with the track jet, subject to additional selection criteria: they should have
at least two SMT hits, transverse momentum exceeding 0.5 GeV, transverse impact parameter with respect
to the primary vertex $|d| < 1.5$ mm, and a separation in the z direction between the point of closest approach
to the beam line and the primary vertex less than 4 mm. Subsequently, tracks associated with identified $V^0$
vertices (see Sec. 3.3) are removed from consideration. All remaining tracks are used in a so-called build-up
vertex finding algorithm. In detail, the algorithm consists of the following steps:

1. Tracks within track jets with large transverse impact parameter significance, $|S_d| > 3$, are selected.

2. Vertices are reconstructed from all pairs of tracks using a Kalman vertex fitting technique [14], and
are retained if the vertex fit yields a goodness-of-fit $\chi^2 < \chi^2_{\max} = 100$.

3. Additional tracks pointing to these seed vertices are added one by one, according to the resulting $\chi^2$
contribution to the vertex fit. The combination yielding the smallest increase in fit $\chi^2$ is retained.

4. This procedure is repeated until the increase in fit $\chi^2$ exceeds a set maximum, $\Delta\chi^2_{\max} = 15$, or the
total fit $\chi^2$ exceeds $\chi^2_{\max}$.

5. The resulting vertex is selected if in addition, the angle $\zeta$ between the reconstructed momentum of the
displaced vertex (computed as the sum of the constituent tracks' momenta) and the direction from
the primary to the displaced vertex (in the transverse plane) satisfies $\cos\zeta > 0.9$, and the vertex decay
length in the transverse direction $L_{xy} < 2.6$ cm.

6. Many displaced vertex candidates may result, with individual tracks possibly contributing to multiple
candidates. Duplicate displaced vertex candidates are removed until no two candidates are associated
with identical sets of tracks.

7. Secondary vertices are associated with the nearest calorimeter jets if $\Delta\mathcal{R}$(vertex, jet) < 0.5. Here, the
vertex direction is computed as the difference of the secondary and primary vertex positions.
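
The pointing and decay length requirements of step 5 can be sketched in Python as follows. The sketch covers only this one step, not the build-up vertex finding itself; the function name and the input format (transverse positions in cm, track transverse momenta as `(px, py)` pairs) are assumptions made for illustration, while the thresholds 0.9 and 2.6 cm are those given in the list above.

```python
import math


def passes_vertex_cuts(pv_xy, sv_xy, track_pxpy, min_cos_zeta=0.9, max_lxy_cm=2.6):
    """Check cos(zeta) > min_cos_zeta and Lxy < max_lxy_cm for a displaced vertex candidate."""
    dx = sv_xy[0] - pv_xy[0]
    dy = sv_xy[1] - pv_xy[1]
    lxy = math.hypot(dx, dy)                 # transverse decay length
    px = sum(t[0] for t in track_pxpy)       # vertex momentum = sum of track momenta
    py = sum(t[1] for t in track_pxpy)
    pt = math.hypot(px, py)
    if lxy == 0.0 or pt == 0.0:
        return False                         # direction undefined
    cos_zeta = (dx * px + dy * py) / (lxy * pt)
    return cos_zeta > min_cos_zeta and lxy < max_lxy_cm


# Hypothetical candidate: vertex displaced by 3 mm along x, momentum nearly along x.
tracks_pxpy = [(2.0, 0.1), (1.5, -0.2)]
accepted = passes_vertex_cuts((0.0, 0.0), (0.3, 0.0), tracks_pxpy)
```

A candidate whose momentum points back toward the primary vertex, or whose decay length exceeds 2.6 cm, fails the check.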

Figure 8 shows distributions that characterize the properties of b-jet and light-flavor secondary vertices
reconstructed in $t\bar{t}$ events simulated using the Alpgen [15] event generator: the multiplicity of vertices
found in a track jet ($N_{\text{vtx}}$), the number of tracks associated with the vertex ($N_{\text{trk}}$), the mass of the vertex
($m_{\text{vtx}}$) calculated as the invariant mass of all track four-momentum vectors assuming that all particles are
pions, and the largest decay length significance, $S_{xy} = L_{xy}/\sigma(L_{xy})$, where $\sigma(L_{xy})$ represents the uncertainty
on $L_{xy}$.
Figure 8: Properties of the secondary vertices for tagged b and light jets in $t\bar{t}$ MC: multiplicity of vertices found in a track jet
(a), the number of tracks associated with the vertex (b), the mass of the vertex (c), and the decay length significance (d).

A calorimeter jet is tagged as a b jet if it has at least one secondary vertex with decay length significance 
greater than a nominal value. In order to characterize the performance of this tagging algorithm, various 
versions have been studied, differing in the requirements used in the selection procedure. 

The algorithm as described above is referred to as the Loose algorithm. In Sec. 7.1.1, a high-efficiency
version, labeled SuperLoose, with a correspondingly high tagging rate for light-flavor jets is also used. The 
high efficiency is obtained by placing no requirement on the impact parameter significance of the tracks used 
to reconstruct the secondary vertices. 
5. The Jet Lifetime Probability Tagger

The impact parameters of all tracks associated with a calorimeter jet can be combined into a single
variable, the Jet Lifetime Probability (JLIP) $\mathcal{P}_{\text{JLIP}}$ [16, 17, 18], which can be interpreted as the confidence
level that all tracks in a jet originate from the (selected) primary interaction point. Jets from light quark
fragmentation are expected to present a uniform $\mathcal{P}_{\text{JLIP}}$ distribution between 0 and 1, whereas jets from c
and b quarks will exhibit a peak at a very low $\mathcal{P}_{\text{JLIP}}$ value. It is thus easy to select b-quark jets by requiring
this probability not to exceed a given threshold, the value of which depends on the signal efficiency and
background rejection desired for a given physics analysis.

Using the impact parameters of reconstructed tracks allows control of their resolution by using data,
minimizing the need for simulated samples. For this purpose, the impact parameter is signed by using
the coordinates of the track at the point of closest approach to the fitted primary vertex, $\vec{d}$, and the jet
momentum vector, $\vec{p}_T(\text{jet})$. In the plane transverse to the beam axis, the distance of closest approach to the
primary vertex ($d = |\vec{d}|$) is given the same sign as the scalar product $\vec{d} \cdot \vec{p}_T(\text{jet})$. The signed d distribution
for tracks from light quark fragmentation is almost symmetric, whereas the distribution for tracks from
b-hadron decay exhibits a long tail at positive values. Therefore, provided that the sign of $\vec{d} \cdot \vec{p}_T(\text{jet})$ is correctly
determined, the negative part of the d distribution allows the d resolution function to be parametrized.
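
The sign convention described above can be written as a small Python function. This is a sketch with hypothetical argument names: the point of closest approach and the primary vertex are given as transverse `(x, y)` positions, and the jet direction as its transverse momentum components.

```python
import math


def signed_impact_parameter(pca_xy, pv_xy, jet_pt_xy):
    """Transverse impact parameter, signed by the scalar product with the jet pT vector."""
    dx = pca_xy[0] - pv_xy[0]
    dy = pca_xy[1] - pv_xy[1]
    d = math.hypot(dx, dy)
    dot = dx * jet_pt_xy[0] + dy * jet_pt_xy[1]
    # A vanishing scalar product is assigned a positive sign here (a convention of this sketch).
    return d if dot >= 0.0 else -d
```

A track whose point of closest approach lies in the hemisphere of the jet gets a positive value, as expected for decay products of a hadron that traveled along the jet direction before decaying.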

5.1. Calibration of the impact parameter resolution 

In order to tune the impact parameter uncertainty, $\sigma_d$, computed from the track fit, the following variable
is introduced: $p_{\text{scat}} = p(\sin\theta)^{3/2}$, where p is the particle momentum and $\theta$ its polar angle relative to the
beam axis. In the plane transverse to the beam axis, the smearing due to multiple scattering is inversely
proportional to $p_T = p\sin\theta$ and proportional to the square root of the distance traveled by the track.
Assuming the detector material to be distributed along cylinders aligned with the beam, this distance is also
inversely proportional to $\sin\theta$. The d distributions are then computed in sixteen different $p_{\text{scat}}$ intervals.

In order to parametrize the d resolution, five track categories are considered: 

• at most six CFT hits (each doublet layer may give rise to one hit) and at least one SMT hit in the
innermost layer, for tracks with $|\eta| > 1.6$;

• at least seven CFT hits and 1, 2, 3, or 4 SMT superlayer hits (the eight SMT barrel layers are grouped
into four superlayers; the two neighboring layers constituting one superlayer provide full azimuthal
acceptance).

The first category includes forward tracks outside the full CFT acceptance; the other four are central tracks with
different numbers of SMT hits.

The calibration is performed starting from the impact parameter significance $S_d = d/\sigma_d$, with the sign
of d determined as described above. In each $p_{\text{scat}}$ interval and for each category, the $S_d$ distribution is fitted
using a Gaussian function to describe the d resolution. The fitted pull values (the widths of the Gaussians)
are presented in Fig. 9 for multijet data. The same calibration procedure is carried out for simulated QCD
events, and the corresponding results are also shown in Fig. 9. The superimposed curves are empirical
parametrizations of the data and the simulation. The pull values are found to go up to 1.2 in data, while
they are closer to 1 in the simulation.

As the impact parameter resolution may be sensitive to the primary vertex resolution, the d significance
is also fitted separately for events with different numbers of tracks, $N_{\text{PV}}$, attached to the primary vertex.
As shown in Fig. 10 for multijet data and simulated QCD events, the pull value increases significantly with
$N_{\text{PV}}$ (here the $p_{\text{scat}}$ and category dependence of the pull value is already corrected for).

Then for each track, the impact parameter uncertainty and associated significance are corrected according
to the track's measured $p_{scat}$ value, category $i$, and number of tracks $N_{PV}$ attached to the primary vertex:

$$\sigma_d^{corr} = \mathrm{pull}(p_{scat}, i, N_{PV}) \cdot \sigma_d, \qquad S_d^{corr} = S_d / \mathrm{pull}(p_{scat}, i, N_{PV}). \qquad (4)$$

Figure 9: Fitted pull values of the track impact parameters as a function of $p_{scat} = p(\sin\theta)^{3/2}$, for different track categories.
The superimposed solid (dashed) curves are empirical parametrizations of the data (QCD MC).
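
The correction of Eq. (4) amounts to a single rescaling per track. The following is a minimal sketch, assuming the pull value for the track's $p_{scat}$ interval, category, and $N_{PV}$ has already been looked up; the function name `apply_pull_correction` and the numbers in the usage lines are illustrative and not taken from the analysis.

```python
def apply_pull_correction(d, sigma_d, pull):
    """Return (sigma_corr, s_corr) following Eq. (4).

    d       : signed impact parameter
    sigma_d : its uncorrected uncertainty (same units as d)
    pull    : fitted pull value for this track (lookup not shown here)
    """
    if sigma_d <= 0 or pull <= 0:
        raise ValueError("sigma_d and pull must be positive")
    sigma_corr = pull * sigma_d      # the uncertainty is scaled by the pull
    s_corr = (d / sigma_d) / pull    # the significance is divided by the pull
    return sigma_corr, s_corr


# Illustrative values only: d = 0.012 cm, sigma_d = 0.004 cm, pull = 1.2
sigma_corr, s_corr = apply_pull_correction(0.012, 0.004, 1.2)
```

By construction the corrected significance equals $d/\sigma_d^{corr}$, so either of the two corrected quantities determines the other.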



The corrected $\sigma_d^{corr}$ resolutions are shown in Fig. 11 for multijet data and simulated QCD events. In the
approximation of small angles [1], they can be parametrized as

$$\sigma_d^{corr} = \frac{a}{p(\sin\theta)^{3/2}} \oplus b, \qquad (5)$$

where $a$ describes multiple scattering effects, and $b$ is the asymptotic resolution (which is sensitive to the
primary vertex resolution, detector alignment, SMT intrinsic resolution, etc.); the symbol $\oplus$ denotes their
summing in quadrature. This parametrization is superimposed in Fig. 11.
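
As an illustration of Eq. (5), the sketch below evaluates the quadrature sum of the multiple-scattering and asymptotic terms. The function name and the parameter values are hypothetical placeholders chosen for easy arithmetic, not fitted values from this analysis.

```python
import math


def sigma_d_model(p, theta, a, b):
    """Impact parameter resolution of Eq. (5): a / p_scat (+) b in quadrature.

    p     : track momentum (GeV)
    theta : polar angle (rad)
    a     : multiple-scattering coefficient (length x GeV)
    b     : asymptotic resolution (length)
    """
    p_scat = p * math.sin(theta) ** 1.5
    if p_scat <= 0:
        raise ValueError("p_scat must be positive")
    return math.hypot(a / p_scat, b)  # sqrt((a / p_scat)**2 + b**2)


# Placeholder parameters: a = 60, b = 40 (arbitrary length units)
low_p = sigma_d_model(2.0, math.pi / 2, 60.0, 40.0)    # scattering term matters
high_p = sigma_d_model(1.0e6, math.pi / 2, 60.0, 40.0)  # approaches b
```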

The impact parameter resolution is observed to be better in the simulation than in data. For forward
tracks with fewer than 7 CFT hits and with $p_{scat} > 10$ GeV, the multiple-scattering small-angle approximation
is no longer valid. This can lead to misreconstructed track momenta, and consequently the measured
$d$ resolution is larger than its asymptotic fitted value (see Fig. 11a). The parametrization is also imperfect
at the lowest $p_{scat}$ values, although the reason for this has not been ascertained.



5.2. Lifetime probability 

The data are used to calibrate the impact parameter significance. For multijet data or QCD MC, the
negative part of the $d$ significance distribution, denoted impact parameter resolution function $\mathcal{R}(S_d^{corr})$,
is parametrized as the sum of four Gaussian functions, as illustrated in Fig. 12. After removing tracks
originating from $V^0$ candidates (see Sec. 3), the track categories used in the previous section are extended
to take into account the number of SMT and CFT hits, $|\eta|$, fit $\chi^2$, and $p_T$ values of the tracks, as listed in
Table 1. The category ranges are adjusted to describe as much as possible geometric and tracking effects.
This further refinement results in 29 track categories.

Figure 10: Fitted pull values of the track impact parameters, corrected here for their $p_{scat}$ and category dependence, as
a function of the number of tracks attached to the primary vertex, combining all track categories. The superimposed solid
(dashed) curves are empirical parametrizations of the data (QCD MC).

Table 1: Track categories used for the parametrization of the impact parameter resolution functions. The "≥ 1 inner layer hit"
line denotes the requirement of at least one hit in the innermost SMT superlayer.



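For tracks with positive $d$, the parametrized resolution function is then converted into a probability for
this track to originate from the primary interaction point:

$$\mathcal{P}_{trk}(S_d^{corr}) = \frac{\int_{-\infty}^{-|S_d^{corr}|} \mathcal{R}(s)\,ds}{\int_{-\infty}^{0} \mathcal{R}(s)\,ds}. \qquad (6)$$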
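
For a resolution function written as a sum of Gaussians, the ratio of integrals in Eq. (6) has a closed form in terms of the complementary error function. The sketch below assumes zero-mean Gaussian components and a lower integration limit of $-\infty$; both are simplifying assumptions, and the component weights and widths are invented for illustration rather than taken from the fits of Table 1.

```python
import math


def track_probability(s_corr, components):
    """Probability of Eq. (6) for a sum of zero-mean Gaussians.

    components : list of (area, sigma) pairs, where area is the integral
                 of the Gaussian component and sigma its width.
    """
    total = sum(area for area, _ in components)
    # For one zero-mean Gaussian, tail(|s|) / tail(0) = erfc(|s| / (sigma * sqrt(2))).
    tail = sum(
        area * math.erfc(abs(s_corr) / (sigma * math.sqrt(2.0)))
        for area, sigma in components
    )
    return tail / total


# Hypothetical two-component resolution function (a narrow core and a wider tail)
toy_resolution = [(0.7, 1.0), (0.3, 2.5)]
p_small = track_probability(1.0, toy_resolution)
p_large = track_probability(3.0, toy_resolution)
```

Tracks with a large corrected significance receive a small probability, which is the behavior exploited when the track probabilities are combined into a jet probability.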

The corresponding track probabilities are shown in Fig. 13 for multijet data and simulated jets of different
flavors, and for positive and negative $d$ values. Tracks with negative $d$ values in multijet data and in simulated
light quark jets are used to define the $d$ resolution functions, thus ensuring uniform $\mathcal{P}_{trk}(S_d^{corr} < 0)$ probability
distributions. For positive $d$, a significant peak at low $\mathcal{P}_{trk}(S_d^{corr} > 0)$ probability is present in simulated
$c$ and $b$ jets. In multijet data, a peak is also observed at low values, which is partly due to the presence of
$V^0$s (which are not all removed, see Sec. 3.3), but also to tracks from charm and $b$-hadron decays. Note
that for simulated $c$ and $b$ jets, a slight peak remains at negative $d$ due to a flip of the $d$ sign, mainly due
to tracks very close to the jet axis direction (see also Sec. ). In addition, a small dip is observed at low
$\mathcal{P}_{trk}(S_d^{corr} < 0)$. This dip is due to the fact that, as for the data, the resolution function for the simulated
sample has been derived without removal of the heavy flavor component.

Finally, the selected $N_{trk}$ tracks with positive $d$ significance are used to compute the jet probability $\mathcal{P}_{JLIP}$
as

$$\mathcal{P}_{JLIP} = \Pi \times \sum_{j=0}^{N_{trk}-1} \frac{(-\ln \Pi)^j}{j!}, \qquad \text{with} \quad \Pi = \prod_{i=1}^{N_{trk}} \mathcal{P}_{trk}(S_{d,i}^{corr}). \qquad (7)$$

Figure 11: Corrected track impact parameter resolution as a function of $p_{scat}$, for different track categories. The solid (dashed)
curve is a fit to the data (QCD MC).

For tracks with negative $d$, a jet probability $\mathcal{P}_{JLIP}$ can be computed analogously.

By construction, if the $\mathcal{P}_{trk}$ are uniformly distributed and uncorrelated, $\mathcal{P}_{JLIP}$ will also be uniformly
distributed, independent of $N_{trk}$. Therefore, apart from wrongly assigned negative $d$ in the case of
tracks originating from the decay of long-lived particles, and from any correlations that are induced by the
common primary vertex (which is reconstructed from the tracks under consideration, among others), the
resulting $\mathcal{P}_{JLIP}$ distribution is indeed expected to be uniform for negative $d$ tracks in multijet data. These
distributions are shown in Fig. 14 for multijet data and simulated jets of different flavors, and for positive
and negative $d$ values. Applying this tagging algorithm simply entails requiring that a jet's $\mathcal{P}_{JLIP}$ value does
not exceed some given maximum value.
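
The combination of track probabilities into a jet probability (Eq. 7) can be sketched in a few lines. This is an illustrative implementation of the formula only; the function name is hypothetical and the track selection is assumed to have been applied beforehand.

```python
import math


def jet_probability(track_probs):
    """Combine per-track probabilities into the jet probability of Eq. (7)."""
    n_trk = len(track_probs)
    if n_trk == 0:
        raise ValueError("at least one selected track is required")
    product = math.prod(track_probs)      # Pi in Eq. (7); needs Python >= 3.8
    if product == 0.0:
        return 0.0                        # avoids log(0); the limit of Eq. (7) is 0
    minus_log = -math.log(product)
    series = sum(minus_log ** j / math.factorial(j) for j in range(n_trk))
    return product * series


# A jet with two unremarkable tracks versus a jet with two displaced tracks
p_light_like = jet_probability([0.5, 0.5])
p_b_like = jet_probability([0.01, 0.02])
```

For a single track the jet probability reduces to the track probability, and the sum over $j$ is what keeps the result uniformly distributed for uniformly distributed, uncorrelated inputs regardless of $N_{trk}$.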

6. The Counting Signed Impact Parameter Tagger 

In this method [20], as in Sec. 5, there is no attempt to use reconstructed secondary vertices. Instead,
the signed impact parameter significance $S_d$ is calculated for all good tracks located within a $\mathcal{R} = 0.5$ cone
around the jet axis. For the present purpose, the definition of a good track is as follows:

• the track should be associated with the hard interaction (the difference between the z coordinates of 
the DCA point and the primary vertex should be less than 1 cm); 

• the track DCA must not be too large, $|d| < 2$ mm;

• the track transverse momentum should satisfy $p_T > 1$ GeV;

Figure 12: The impact parameter resolution functions, shown here for one of the 29 track categories described in Table 1, are
parametrized as a sum of four Gaussian functions.

• the track fit should be of good quality: its $\chi^2$ per degree of freedom $\chi^2_{dof}$ should satisfy $\chi^2_{dof} < 9$;

• tracks with $\chi^2_{dof} < 3$ are required to have at least 2 SMT hits;

• tracks with $3 < \chi^2_{dof} < 9$ are required to have either 4 SMT hits and at least 13 CFT hits, or at least
5 SMT hits and either 0 or at least 11 CFT hits.

The effect of the last criterion is to impose more stringent quality requirements on tracks having $|\eta| \approx 1.5$.
This is done as this region suffers from a higher fake track rate, leading to track candidates with a small
but non-zero number of CFT hits. Finally, tracks originating from a $V^0$ candidate, as detailed in Sec. 3, are
also excluded.

A jet is considered to be tagged by the Counting Signed Impact Parameter (CSIP) tagger if there are
at least two good tracks with $S_d/a > 3$ or at least three good tracks with $S_d/a > 2$, where $a$ is a scaling
parameter. The choice of $a$ determines the operating point ($b$-tagging efficiency and mistag rate) of the
algorithm. Alternatively, if a jet has at least two good tracks with positive $S_d$, then the largest value of
$a$ at which there are at least two good tracks with $S_d/a > 3$ or at least three good tracks with $S_d/a > 2$ can
be used as a continuous output variable of the tagger. Here, $a$ is set to 1.2, as suggested by optimization
studies using simulated data.

In the actual implementation of the algorithm, there is an additional condition related to the fact that
the sign of $S_d$ cannot be determined accurately for tracks that are very close to the jet axis, as illustrated by
Fig. 15. The criterion of closeness is empirically chosen as the difference in the azimuthal angle between the
track and jet directions, $\Delta\varphi$, being less than $\Delta\varphi_0 = 20$ mrad. This value has been optimized by comparing
algorithm performance for various values of $\Delta\varphi_0$. Four categories of tracks are counted separately:

• tracks with $S_d/a > 3$, $|\Delta\varphi| > \Delta\varphi_0$ ("3σ-strong" tracks, their total number to be denoted as $N_{3s}$),

• tracks with $2 < S_d/a < 3$, $|\Delta\varphi| > \Delta\varphi_0$ ("2σ-strong" tracks, $N_{2s}$),

• tracks with $|S_d/a| > 3$, $|\Delta\varphi| < \Delta\varphi_0$ ("3σ-weak" tracks, $N_{3w}$),

• tracks with $2 < |S_d/a| < 3$, $|\Delta\varphi| < \Delta\varphi_0$ ("2σ-weak" tracks, $N_{2w}$).

If CSIP is used as a stand-alone algorithm, the jet is considered tagged if $N_{3s} + N_{2s} + N_{3w} + N_{2w} \ge 3$ and
$N_{3s} + N_{2s} \ge 1$, or $N_{3s} + N_{3w} \ge 2$ and $N_{3s} \ge 1$. In other words, in addition to the original tagging condition
(at least two good tracks with $S_d/a > 3$ or at least three good tracks with $S_d/a > 2$), at least one of the
tagging tracks is required to be strong. In the present implementation, the four numbers ($N_{3s}$, $N_{2s}$, $N_{3w}$,
$N_{2w}$) are packed in a single variable which is used in the combined algorithm, as explained in Sec. 7.
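
The counting and stand-alone tagging logic described above can be sketched as follows. The function names and the input format, a list of `(s_d, dphi)` pairs for tracks that already pass the good-track selection, are assumptions made for illustration. The text does not specify how values exactly at a boundary are treated, so the handling of equalities below is an arbitrary choice.

```python
def csip_counts(tracks, a=1.2, dphi0=0.020):
    """Count the four CSIP track categories.

    tracks : iterable of (s_d, dphi) pairs, with s_d the signed impact
             parameter significance and dphi the track-jet azimuthal
             difference in radians.
    Returns (n3s, n2s, n3w, n2w).
    """
    n3s = n2s = n3w = n2w = 0
    for s_d, dphi in tracks:
        x = s_d / a
        if abs(dphi) > dphi0:      # "strong": the sign of s_d is trusted
            if x > 3:
                n3s += 1
            elif x > 2:
                n2s += 1
        else:                      # "weak": only |s_d / a| is used
            if abs(x) > 3:
                n3w += 1
            elif abs(x) > 2:
                n2w += 1
    return n3s, n2s, n3w, n2w


def csip_tagged(n3s, n2s, n3w, n2w):
    """Stand-alone CSIP decision: counting condition plus one strong track."""
    three_above_2 = (n3s + n2s + n3w + n2w >= 3) and (n3s + n2s >= 1)
    two_above_3 = (n3s + n3w >= 2) and (n3s >= 1)
    return three_above_2 or two_above_3


# Hypothetical jet: one strong and one weak track above 3, one unremarkable track
counts = csip_counts([(4.0, 0.1), (3.9, 0.005), (1.0, 0.2)])
```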

Figure 13: Track probability ($\mathcal{P}_{trk}$, see Eq. 6) distribution in multijet data (a) and QCD MC simulation of light-flavor (b), $c$
(c), and $b$ (d) jets, for positive (dark histograms) and negative (light histograms) $d$ values.



7. The Neural Network Tagger 

Artificial neural networks (see e.g. [21]) are modeled after the synaptic processes in the brain, and
have proved to be a versatile machine learning approach to the general problem of separating samples of 
events characterized by many event variables. In particular, the potential of neural networks to exploit the 
correlations between variables, and the possibility to train the network to recognize such correlations, make 
their use in high energy physics attractive. 

The neural network (NN) tagger attempts to discriminate between $b$ jets and other jet flavors by combining
input variables from the SVT, JLIP, and CSIP tagging algorithms [22]. The NN implementation
chosen is the TMultiLayerPerceptron from the ROOT [23] framework.

7.1. Optimization 

The following NN parameters were optimized: input variables (number and type), NN structure, number
of training epochs, and jet selection criteria. The choice of input variables is crucial for the performance of 
the NN and so was optimized first. 

The NN parameters were optimized by minimizing the light-flavor tagging efficiency or fake rate for
fixed benchmark $b$-tagging efficiencies. The optimization plots were produced from a high-$p_T$ ALPGEN $t\bar{t}$
sample and cross checked with a PYTHIA [24] $b\bar{b}$ sample to ensure there was no sample, $p_T$, or MC generator
dependence in the optimization.

The NN was trained on simulated multijet light-flavor and $b\bar{b}$ samples. To avoid overtraining, i.e., a
focus on features that are too event specific, the signal sample of 270 000 $b\bar{b}$ events and the background
sample of 470 000 light-flavor events were each split in half, with one half used as the training sample and
the other half to evaluate the network's performance. It was verified that for the parameter settings as 
described below, no overtraining occurs. 

Figure 14: Jet probability ($\mathcal{P}_{JLIP}$) distribution in multijet data (a) and QCD MC simulation of light-flavor (b), $c$ (c), and $b$
(d) jets, for positive (dark histograms) and negative (light histograms) $d$ values.



7.1.1. Input variables 

Initially, a large number of lifetime-related variables were investigated (e.g. variables for different SVT
versions, or various combinations of the numbers of tracks in the different CSIP track categories). After a first
assessment of their performance, nine input variables were selected for the final input variable optimization
due to their good discrimination between b jets and light-flavor jets. Six of the variables are based on the 
secondary vertices reconstructed using the SVT algorithm. The remaining three summarize information 
from the JLIP and CSIP algorithms. The input variables are: 

SVT $S_{xy}$: the decay length significance (the decay length in the transverse plane divided by its uncertainty)
of the secondary vertex with respect to the primary vertex.

SVT $\chi^2_{dof}$: the $\chi^2$ per degree of freedom of the secondary vertex fit.

SVT $N_{trk}$: the number of tracks used to reconstruct the secondary vertex.

SVT $m_{vtx}$: the mass of the secondary vertex.

SVT $N_{vtx}$: the number of secondary vertices reconstructed in the jet.

SVT $\Delta\mathcal{R}$: the distance in $(\eta, \varphi)$ space between the jet axis and the difference between the secondary and
primary vertex positions.

JLIP $\mathcal{P}_{JLIP}$: the "jet lifetime probability" computed in Sec. 5.

JLIP $\mathcal{P}_{RedJLIP}$: JLIP $\mathcal{P}_{JLIP}$ re-calculated with the track with the highest significance removed from the
calculation.

CSIP $N_{CSIP}$: a combined variable based on the number of tracks with an impact parameter significance
greater than an optimized value. This variable is discussed in more detail below.

Since more than one secondary vertex can be found for each jet, vertex variables are ranked in order of
the most powerful discriminator, the decay length significance ($S_{xy}$). The secondary vertex with the largest
$S_{xy}$ in a jet is used to provide the input variables. If no secondary vertex is found, the SVT values are set
to 0, apart from the SVT $\chi^2_{dof}$, which is set to 75, corresponding to the upper bound of $\chi^2_{dof}$ values.

The standard, Loose, implementation of the SVT algorithm requires a displaced vertex constructed from 
significantly displaced tracks. While such an approach helps to isolate a pure sample of heavy flavor decays, 
it typically results in a low efficiency. In the context of an NN optimization, this is undesirable as any 
vertex-related information is only available if a displaced vertex is found. For this reason, the SuperLoose 
SVT algorithm described in Sec. 2 is used: even if the vertex candidates it finds are a less pure sample,
it finds significantly more displaced vertices and they provide additional discrimination between $b$ jets and
other flavors. Figure 16 shows the efficiency, as a function of jet $p_T$, for both algorithm choices for light-flavor
and $b$ jets. The NN tagger is found to perform best if information from both the SuperLoose and the Loose
SVT algorithms is used: the $N_{trk}$ variable is taken from the latter, and all other SVT variables from the
former.

Figure 16: Efficiencies of the SuperLoose (SVT$_{SL}$, up triangles) and Loose (SVT$_{L}$, down triangles) SVT taggers for $b\bar{b}$ (a) and
light-flavor (b) MC jets.



The CSIP $N_{CSIP}$ variable is based on the four CSIP variables $N_{3s}$, $N_{2s}$, $N_{3w}$, and $N_{2w}$ described in Sec. 6.
Neural networks tend to perform best when provided with continuous values. Since the CSIP variables have
small integer values, which are not well suited as inputs, they are combined in one variable, which brings the
advantage of reducing the number of input variables, hence simplifying the NN:

$$N_{CSIP} = 6 \times N_{3s} + 4 \times N_{2s} + 3 \times N_{3w} + 2 \times N_{2w}. \qquad (8)$$

The weights were determined in an empirical manner to give optimum performance for this variable 
alone. 
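
Equation (8) is a fixed-weight sum of the four track counts. A minimal sketch, with a hypothetical function name:

```python
def n_csip(n3s, n2s, n3w, n2w):
    """Combined CSIP variable of Eq. (8)."""
    return 6 * n3s + 4 * n2s + 3 * n3w + 2 * n2w


# One 3-sigma strong track plus one 3-sigma weak track
example_value = n_csip(1, 0, 1, 0)
```

With these weights a strong track always contributes more than a weak one at the same significance level, and a jet with one 3σ-strong and one 3σ-weak track already exceeds a threshold of 8.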

First, the input variables were optimized. At this stage, the other NN parameters were set to the
following values: NN structure N:2N:1, where N is the number of input variables; 500 training epochs; and
selection criteria SVT$_{SL}$ $S_{xy} > 2$ or CSIP $N_{CSIP} > 8$ or JLIP $\mathcal{P}_{JLIP} < 0.02$. (The subscript SL serves to
clarify that this variable is obtained from the SuperLoose SVT algorithm, as described above.)

The input variables were optimized by first identifying the two most powerful input variables by testing
every possible combination in a two-input NN, resulting in the choice of $S_{xy}$ and $N_{CSIP}$ as initial variables.
Starting with an initial NN with these $n = 2$ variables and a list of $m$ variables, the remaining variables
were then ranked in order of power using the following procedure:

1. each of the $m$ variables to be tested was added individually to the initial $n$ variable NN, resulting in
$m$ NNs with $n + 1$ variables each;

2. the variable whose addition yielded the largest improvement in NN performance was identified. This
variable was added permanently to the $n$ variable NN;

3. the above steps were repeated, testing each of the remaining $m - 1$ variables with the new $n + 1$ variable
NN.
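
The ranking procedure above is a greedy forward selection. The sketch below shows its structure only: `fake_rate` is a hypothetical callable standing in for "train an NN on these variables and return its fake rate at the benchmark efficiency", and the toy scoring function and variable names in the usage lines are invented so that the example runs without any training.

```python
def rank_variables(initial, candidates, fake_rate):
    """Greedy forward selection of input variables.

    initial    : variables already in the network (here the best pair)
    candidates : remaining variables to be ranked
    fake_rate  : callable(list_of_variables) -> fake rate at fixed efficiency
    Returns a list of (variable, fake_rate_after_adding_it) in ranking order.
    """
    selected = list(initial)
    remaining = list(candidates)
    ranking = []
    while remaining:
        # Steps 1 and 2: try each candidate, keep the one giving the lowest fake rate.
        best = min(remaining, key=lambda v: fake_rate(selected + [v]))
        remaining.remove(best)
        selected.append(best)
        ranking.append((best, fake_rate(selected)))
        # Step 3: repeat with the enlarged network.
    return ranking


# Toy stand-in for NN training: each variable removes a fixed amount of fake rate.
toy_gain = {"Sxy": 0.30, "Ncsip": 0.20, "A": 0.05, "B": 0.15, "C": 0.10}


def toy_fake_rate(variables):
    return 1.0 - sum(toy_gain[v] for v in variables)


ranking = rank_variables(["Sxy", "Ncsip"], ["A", "B", "C"], toy_fake_rate)
```

In a real optimization each call to `fake_rate` would involve a full training, so the cost grows quadratically with the number of candidate variables.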

The fake rate of each NN at a 70% $b$-jet efficiency benchmark scenario was used to select the optimal
variable. As a cross check, the procedure was repeated for signal efficiencies ranging from 50% to 75% in
5% steps. The same set of variables was found to give the greatest reduction in fake rate in each case. As
an example, two benchmark scenarios are shown in Fig. 17.




Figure 17: Fake rate for fixed signal efficiencies of 75% (up triangles, left axis) and 50% (down triangles, right axis) as a 
function of additional NN variables. The NN variables were added to the NN in order of performance. The lines are intended 
to guide the eye only. The errors are statistical only. 

The NNs with seven to nine variables have the best performance. Therefore the seven variable NN is
chosen as the optimal solution, keeping the NN as simple as reasonably possible. The final selected variables,
ranked in order of performance, are SVT$_{SL}$ $S_{xy}$, CSIP $N_{CSIP}$, JLIP $\mathcal{P}_{JLIP}$, SVT$_{SL}$ $\chi^2_{dof}$, SVT$_{L}$ $N_{trk}$, SVT$_{SL}$
$m_{vtx}$, and SVT$_{SL}$ $N_{vtx}$. The distributions of these variables in simulated QCD samples are shown in Fig. 18.

7.1.2. Number of training epochs and neural network structure 

The number of training epochs was varied from 50 up to 2000. For each of the benchmark scenarios the
majority of the minimization is reached by $\approx 400$ epochs, with only small further improvement thereafter.
The number of hidden layers was set to one, as one layer should be sufficient to model any continuous
function [25] and this minimizes CPU usage. The number of hidden nodes was optimized by varying their
number from seven through thirty-four. Twenty-four was chosen as the optimal number of hidden nodes.

7.1.3. Input selection criteria 

Another important attribute of the NN is the selection of the jets which are used to train the NN. A 
selection too loose can cause a loss of performance as the NN training is dominated by signal and background 
jets which could have been separated with a simple requirement, causing a loss of resolution. A selection 
which is too tight will cause a significant loss of b jets and therefore limit the maximum possible efficiency. 

The input selection criteria were optimized by considering each variable in turn, starting with the most
important variable, SVT$_{SL}$ $S_{xy}$, then JLIP $\mathcal{P}_{JLIP}$, and finally CSIP $N_{CSIP}$ (at this stage, a requirement on
JLIP $\mathcal{P}_{JLIP}$ performs better than one on CSIP $N_{CSIP}$). The optimal values were chosen as SVT$_{SL}$ $S_{xy} > 2.5$,
JLIP $\mathcal{P}_{JLIP} < 0.02$, and CSIP $N_{CSIP} > 8$. The results for SVT$_{SL}$ $S_{xy}$ are shown in Fig. 19 (in this case, the
requirement is fixed at an $S_{xy}$ value of 2.5 since for the loosest operating points, the performance degrades
for even larger $S_{xy}$ values).

Figure 19: Fake rate for fixed signal efficiencies of 75% (a) and 50% (b) as a function of the SVT$_{SL}$ $S_{xy}$ requirement on the
input jets. The lines are intended to guide the eye only.



7.1.4. Optimized NN parameters

The optimized parameter values for the NN tagger are summarized in Table 2.



Parameter | Value
----------|------
NN structure | 7 input nodes : 24 hidden nodes : 1 output node
Input variables (performance ranked) | (1) SVT$_{SL}$ $S_{xy}$ (2) CSIP $N_{CSIP}$ (3) JLIP $\mathcal{P}_{JLIP}$ (4) SVT$_{SL}$ $\chi^2_{dof}$ (5) SVT$_{L}$ $N_{trk}$ (6) SVT$_{SL}$ $m_{vtx}$ (7) SVT$_{SL}$ $N_{vtx}$
Input selection criteria (failure results in NN output of 0) | SVT$_{SL}$ $S_{xy} > 2.5$ or JLIP $\mathcal{P}_{JLIP} < 0.02$ or CSIP $N_{CSIP} > 8$
Number of training epochs | 400

Table 2: Optimized NN parameters.
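
The input selection of Table 2 acts as a gate in front of the network: jets failing all three requirements are assigned an output of 0 without evaluating the NN. A minimal sketch follows; the dictionary keys and the `network` callable are hypothetical stand-ins for the trained TMultiLayerPerceptron, which is not reproduced here.

```python
def passes_input_selection(sxy_sl, p_jlip, n_csip):
    """Loose input selection of Table 2 (logical OR of three requirements)."""
    return sxy_sl > 2.5 or p_jlip < 0.02 or n_csip > 8


def nn_tagger_output(inputs, network):
    """Return the NN output, or 0 for jets failing the input selection.

    inputs  : dict with at least the keys "sxy_sl", "p_jlip", "n_csip"
    network : callable(inputs) -> float, a stand-in for the trained NN
    """
    if not passes_input_selection(inputs["sxy_sl"], inputs["p_jlip"], inputs["n_csip"]):
        return 0.0
    return network(inputs)


# Dummy network returning a constant, used only to exercise the gate
dummy_network = lambda inputs: 0.9
failing_jet = {"sxy_sl": 0.0, "p_jlip": 0.5, "n_csip": 0}
passing_jet = {"sxy_sl": 3.1, "p_jlip": 0.5, "n_csip": 0}
```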



7.2. Performance 

The output from the optimized NN $b$ tagger on $b\bar{b}$ and light-flavor simulated jets is shown in Fig. 20.
There is a significant separation between the signal and background samples. It should be noted that the
light-flavor jets in the distribution have all passed the loose tagging input selection criteria listed in Table 2.

The advantage of combining the input variables from several taggers in an NN is shown in Fig. 21,
which compares the NN $b$-tagging performance to the JLIP, SVT and CSIP taggers. There is a substantial
improvement, with relative efficiency increases of $\approx$ 20-50% for a fake rate of 0.2% and $\approx$ 15% for a fake
rate of 4%. The fake rate is reduced by a factor of between two and three for fixed signal efficiencies.

The NN tagger performance in data is evaluated in the following sections for twelve operating points,
corresponding to NN output discriminant threshold values ranging from 0.1 to 0.925. For illustrative purposes,
detailed results will be provided for threshold values of 0.325 and 0.775, referred to as L2 and Tight,
respectively.

8. Efficiency Estimation 

The performance of the tagging algorithm cannot simply be inferred from simulated samples. Several 
effects cause differences between the data and these simulated samples: 

• Simulated hit resolutions, both in the CFT and in the SMT, have been tuned to reproduce those in 
the data. However, the tuning cannot be expected to be perfect as the observed resolutions in the data 
are also affected by poorly understood geometrical effects which are not modeled in the simulation. 

• A small but non-negligible fraction of the detector elements, in particular in the SMT, are disabled 
from time to time. 

These effects lead to different effective resolutions and efficiencies in data and simulated samples. A calibration
is therefore required. This section describes the estimation of the $b$- and $c$-jet tagging efficiencies,
both of which are dominated by genuine heavy flavor decays. Jets originating from $u$, $d$, or $s$ quarks or from
gluons are jointly referred to as light ($l$) jets. The light-jet tag rate estimation is described in Sec. 9.

8.1. MC and data samples 

The data used in the performance measurements were collected from July 2002 to February 2006 and
correspond to an integrated luminosity of $\approx 1$ fb$^{-1}$. The tagging efficiency in simulated events is measured
using several processes simulated using the PYTHIA ($Z \to b\bar{b}$, $Z \to c\bar{c}$, $Z \to q\bar{q}$, QCD) and ALPGEN (inclusive
$t\bar{t}$) event generators. Large samples of simulated $b$ and $c$ jets are created by combining the appropriate flavor
jets from the individual samples.

8.2. The SystemD method 

The SystemD method [26] was developed to determine identification efficiencies using almost exclusively
the data. Simulated samples are used only to estimate correction factors. The method involves several, 
essentially uncorrelated, identification criteria which are applied to the same data sample. Combining these 
criteria allows the definition of a system of equations which can be solved to extract the efficiency of each 
criterion. 

The data sample is assumed to be composed of a signal and $n$ backgrounds. Denoting by $f_0$ the fraction
of signal events, and by $f_{i=1,\ldots,n}$ the fraction of each considered background, these fractions must satisfy:

$$f_0 + \sum_{i=1}^{n} f_i = 1. \qquad (9)$$



Subsequently, $m$ uncorrelated identification criteria are considered with different selection efficiencies $\varepsilon_{i=0,\ldots,n}^{k}$
for the signal and backgrounds. Only a fraction $Q^k$ of the total number of events will pass the $k$-th
identification criterion. Then a new set of equations can be added for each selection:

$$\sum_{i=0}^{n} \varepsilon_i^{k} f_i = Q^k, \qquad k = 1, \ldots, m. \qquad (10)$$

Figure 20: The NN output for light-flavor (dashed line) and $b$ (continuous line) jets (with $p_T > 15$ GeV and $|\eta| < 2.5$) in
simulated QCD events. Both distributions are normalized to unity.

Figure 21: Performance of the NN (up triangles, continuous line), JLIP (down triangles, dotted line), CSIP (hollow crosses)
and Loose SVT (hollow stars) taggers computed on simulated $Z \to b\bar{b}$ and $Z \to q\bar{q}$ jet samples. Due to the discrete nature of
the CSIP and SVT taggers' outputs a continuous performance curve is not shown.






If the selection criteria are uncorrelated, the total efficiency $\varepsilon_i^{k_1,\ldots,k_j}$ of $j$ criteria applied
successively can be factorized in terms of the individual efficiencies $\varepsilon_i^{k}$:

$$\varepsilon_i^{k_1,\ldots,k_j} = \prod_{l=1}^{j} \varepsilon_i^{k_l}. \qquad (11)$$

A generalization of Eq. (10) can then be obtained for a combination of several criteria:

$$\sum_{i=0}^{n} f_i \prod_{l=1}^{j} \varepsilon_i^{k_l} = Q^{k_1,\ldots,k_j}. \qquad (12)$$

The signal and background fractions are $n + 1$ unknown parameters and each identification criterion introduces
$n + 1$ new unknowns in the form of selection efficiencies. The number of equations of the form of
Eq. (12) depends on the number of combinations of the $m$ criteria, leading to a total of $\sum_{k=0}^{m} \binom{m}{k} = 2^m$
equations. To obtain a system of equations which can be solved, the number of unknowns must not exceed the
number of equations, i.e. $n$ and $m$ must satisfy $(1 + m) \times (1 + n) \le 2^m$ (Eq. (13)).



The simplest non-trivial solutions are 

• m = 3, n = 1: 8 equations with 8 unknowns; 

• $m = 4$, $n = 2$: 16 equations with 15 unknowns.
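
The counting argument can be checked directly: each subset of the $m$ criteria gives one equation, while the unknowns are the $n + 1$ fractions plus $n + 1$ efficiencies per criterion. The helper names below are hypothetical.

```python
from math import comb


def systemd_counts(m, n):
    """Return (number of equations, number of unknowns) for m criteria, n backgrounds."""
    equations = sum(comb(m, k) for k in range(m + 1))  # one per subset of criteria: 2**m
    unknowns = (1 + m) * (1 + n)                       # fractions plus efficiencies
    return equations, unknowns


def is_solvable(m, n):
    """True if there are at least as many equations as unknowns."""
    equations, unknowns = systemd_counts(m, n)
    return unknowns <= equations


simplest = systemd_counts(3, 1)
next_case = systemd_counts(4, 2)
```

For example, two criteria with one background give 4 equations for 6 unknowns, which is why at least three criteria are needed.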

These systems of equations are nonlinear and have several solutions. Only the simplest case of 8 equations will
be considered in the following. This system has two solutions, which differ by the interchange of efficiencies
assigned to the signal and background samples. As will be detailed in Sec. 8.3.3, further a priori knowledge
of at least one of the unknown parameters is required to resolve the ambiguity. The input parameters are
the fractions of events, $Q^{k_1,\ldots,k_j}$, which are determined directly from the data. There is therefore no input
from simulated events. Solving the system gives access to the signal and background fractions and to the
various efficiencies.

8.3. Application to b-tagging efficiency measurements 

The SystemD method is used here in order to extract the $b$-tagging efficiencies of the NN tagger. The
method is applied to a data jet sample that is expected to have a significantly higher heavy flavor content
than generic QCD events, but is not biased by lifetime requirements and provides large statistics. In detail,
taggable jets are selected based on the following additional criteria:

• the jet must have $p_T > 15$ GeV and $|\eta| < 2.5$;

• the jet must contain a muon with $p_T^{\mu} > 4$ GeV, within a distance $\Delta\mathcal{R} = 0.5$ from the jet axis.

The lifetime composition of the resulting sample could be biased by trigger requirements applying impact 
parameter or secondary vertex requirements. To avoid such biases, events are required to have passed at 
least one lifetime-unbiased trigger. These requirements result in a sample of $141 \times 10^6$ events.

This sample contains a mixture of $b$, $c$, and light-flavor jets. The first two are mostly due to semimuonic
decays of $b$ and $c$ hadrons; muons in light-flavor jets arise mainly from in-flight decays of $\pi^{\pm}$ and $K^{\pm}$ mesons.
To apply the SystemD method as described above, with $m = 3$ criteria and $n = 1$ backgrounds, however,
only a single source of background can be dealt with. The $c$ and light-flavor backgrounds are therefore
combined and denoted as "$cl$". An important consequence of this combination is the fact that the use of
the SystemD method only allows the efficiency to be determined for a specific mixture of $c$ and light-flavor
jets; it is therefore not useful to extract efficiencies for the separate background sources.



$$(1 + m) \times (1 + n) \le 2^m \qquad (13)$$

8.3.1. SystemD selection criteria

The three identification criteria used are: 

1. The NN tagger operating point under study. 

2. A requirement on the transverse momentum of the muon relative to the direction obtained by adding
the muon and jet momenta, $p_T^{rel}$. This criterion is chosen because the high $p_T^{rel}$ values in $b$-hadron decays
are due to the high mass of the $b$ quark, and as such are in principle expected to be independent of
the lifetime criterion.

3. The requirement that the event contain another jet satisfying $\mathcal{P}_{JLIP} < 0.005$, referred to below as
away-side tag. As $b$ quarks are usually produced in pairs, this selection criterion allows an increase in
the fraction of $b$ jets without being applied directly to the muon jet itself, and hence no correlation
with the two other criteria is a priori expected.



8.3.2. Correction factors 

In practice, correlations between these selection criteria are not altogether absent, and they need to 
be accounted for. As insufficient information is available in the data to estimate these correlations, they 
are evaluated on the simulated samples instead. Even though this approach entails some dependence on 
MC, the dependence can be expressed in terms of correction factors, as detailed below, and the efficiencies 
themselves are determined on the data only. 

The first correlation studied is that between the NN tagger and the $p_T^{rel}$ requirement. The $p_T^{rel}$ distribution
of the data sample is shown in Fig. 22 for all taggable jets. The data are fitted as the sum of MC distributions
for each quark flavor, with a free normalization for each flavor. Reasonable agreement is obtained. Although
such a fitting procedure could in principle be used to estimate $b$-jet tagging efficiencies, it relies on more
assumptions than the SystemD method and is therefore not used for this purpose.

Figure 22: $p_T^{rel}$ distribution of muons in taggable jets. The superimposed histograms represent the contribution of simulated
$b \to \mu X$ (dashed), $c \to \mu X$ (dotted), light quark jets (dash-dotted), and their sum (solid histogram). The template fit employed
the method described in Ref. [27].



The requirement $p_T^{rel} > 0.5$ GeV is chosen in order to have a similar efficiency for $c$- and light-quark
jets, so that the application of this requirement affects the flavor composition of the background jets only
modestly. Correction factors $\kappa_b$ and $\kappa_{cl}$ are determined for signal and background jets by dividing the
efficiency for jets satisfying both criteria by the product of the efficiencies for jets satisfying the individual
criteria. They are shown in Figs. 23 and 24, respectively, for the L2 and Tight operating points.

As indicated above, the application of the away-side tag is not a priori expected to be correlated with
the other two criteria. This hypothesis has been verified explicitly in the simulation for the correlation with
the $p_T^{rel}$ requirement. The lifetime tagging requirements applied to both jets, however, could be correlated
by the fact that they involve the same primary vertex. The corresponding correction factors are evaluated
in the same way as $\kappa_b$ and $\kappa_{cl}$. They are denoted $\beta$ for $b$ jets and $\alpha$ for background jets.

In addition to increasing the $b$-jet fraction of the data sample, the application of the away-side tag
modifies the flavor composition of the background sample, as the charm tagging efficiency is expected to be
significantly higher than that for light-flavor jets. This causes a dependence of $\alpha$ on the physics assumptions
made in the MC event generators (in which charm and light-flavor QCD samples are added weighted by
their respective production cross sections to yield a sample of $cl$ jets). Fortunately, it turns out that the
uncertainty on $\alpha$ affects the $b$-tagging efficiency only marginally. The factors $\alpha$ and $\beta$ are shown in Figs. 25
and 26, respectively, for the L2 and Tight operating points.

respectively. The fit uncertainties are too small to be visible in this figure.

8.3.3. SystemD equations 

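Denoting the criteria used in SystemD as $l$ for the lifetime tagging criterion, $m$ for the $p_T^{rel}$ requirement, 
and $b$ for the away-side tag, with the notation for the correction factors as above, and with $f_0$ and $f_1$ of 
Eq. [9] renamed to $f_b$ and $f_{cl}$, respectively, the final system to solve is therefore: 

$$
\begin{aligned}
f_b + f_{cl} &= 1 \\
f_b \varepsilon_b^{l} + f_{cl} \varepsilon_{cl}^{l} &= Q^{l} \\
f_b \varepsilon_b^{m} + f_{cl} \varepsilon_{cl}^{m} &= Q^{m} \\
f_b \varepsilon_b^{b} + f_{cl} \varepsilon_{cl}^{b} &= Q^{b} \\
f_b \kappa_b \varepsilon_b^{l} \varepsilon_b^{m} + f_{cl} \kappa_{cl} \varepsilon_{cl}^{l} \varepsilon_{cl}^{m} &= Q^{l,m} \\
f_b \varepsilon_b^{m} \varepsilon_b^{b} + f_{cl} \varepsilon_{cl}^{m} \varepsilon_{cl}^{b} &= Q^{m,b} \\
f_b \beta \varepsilon_b^{b} \varepsilon_b^{l} + f_{cl} \alpha \varepsilon_{cl}^{b} \varepsilon_{cl}^{l} &= Q^{b,l} \\
f_b \kappa_b \beta \varepsilon_b^{l} \varepsilon_b^{m} \varepsilon_b^{b} + f_{cl} \kappa_{cl} \alpha \varepsilon_{cl}^{l} \varepsilon_{cl}^{m} \varepsilon_{cl}^{b} &= Q^{l,m,b}
\end{aligned}
\tag{14}
$$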



As already mentioned in Sec. [8.2], the SystemD method leads to a set of nonlinear equations, and two 
possible solutions exist for the quantity of interest, the b-tagging efficiency. The ambiguity between these 
two solutions is resolved using the a priori knowledge that the efficiency of each selection criterion for b jets 
should be higher than for background jets: $\varepsilon_b^{l,m,b} > \varepsilon_{cl}^{l,m,b}$. 
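
The origin of the two-fold ambiguity can be illustrated with a short sketch that evaluates the left-hand sides of the eight equations of Eq. (14) for given inputs. It is not the solver used in the analysis. All numerical values are hypothetical, and the correction factors are set to 1 here purely as a simplifying assumption, because in that case exchanging the roles of b and background jets leaves all eight observables exactly unchanged.

```python
import math


def system_lhs(f_b, eff_b, eff_cl, kappa_b=1.0, kappa_cl=1.0, alpha=1.0, beta=1.0):
    """Left-hand sides of the eight equations of Eq. (14).

    eff_b and eff_cl are (l, m, b) tuples: efficiencies of the lifetime tag,
    the pTrel requirement and the away-side tag for b and background jets.
    """
    f_cl = 1.0 - f_b
    l_b, m_b, b_b = eff_b
    l_cl, m_cl, b_cl = eff_cl
    return [
        f_b + f_cl,
        f_b * l_b + f_cl * l_cl,
        f_b * m_b + f_cl * m_cl,
        f_b * b_b + f_cl * b_cl,
        f_b * kappa_b * l_b * m_b + f_cl * kappa_cl * l_cl * m_cl,
        f_b * m_b * b_b + f_cl * m_cl * b_cl,
        f_b * beta * b_b * l_b + f_cl * alpha * b_cl * l_cl,
        f_b * kappa_b * beta * l_b * m_b * b_b
        + f_cl * kappa_cl * alpha * l_cl * m_cl * b_cl,
    ]


def is_physical(eff_b, eff_cl):
    """Selection rule from the text: every criterion is more efficient for b jets."""
    return all(e_b > e_cl for e_b, e_cl in zip(eff_b, eff_cl))


# Hypothetical "true" configuration and its mirror image (labels exchanged).
eff_b_true, eff_cl_true = (0.6, 0.8, 0.5), (0.1, 0.7, 0.2)
observed = system_lhs(0.3, eff_b_true, eff_cl_true)
mirrored = system_lhs(0.7, eff_cl_true, eff_b_true)

# Both configurations predict the same eight observables ...
same_observables = all(
    math.isclose(x, y, rel_tol=1e-12) for x, y in zip(observed, mirrored)
)
# ... but only one of them satisfies the requirement eff_b > eff_cl.
keep_true = is_physical(eff_b_true, eff_cl_true)
keep_mirror = is_physical(eff_cl_true, eff_b_true)
```

With correction factors different from 1 the second solution is no longer an exact mirror image, but the same ordering requirement is what selects the physical one.
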

Figure 24: (a): the efficiency of the $p_T^{rel}$ requirement (circles), the L2 NN tagger (squares), the AND of the NN tag and $p_T^{rel}$ 
requirement (up triangles), and the correlation factor $\kappa_{cl}$ (down triangles and fit), measured in the $cl \to \mu X$ MC sample in the 
jet $p_T$ projection. (b): same in the $|\eta|$ projection. (c) and (d): same for the Tight NN tagger in the jet $p_T$ and $|\eta|$ projections, 
respectively. The dotted lines represent the fit uncertainties. 

Figure 25: (a): the L2 NN tagging efficiency (circles), the tagging efficiency after an away-tag requirement (squares), and their 
ratio, $\alpha$ (up triangles and fit) in the $cl \to \mu X$ MC sample as a function of jet $p_T$. (b): same as a function of $|\eta|$. (c) and (d): 
same for the Tight NN tagger as a function of jet $p_T$ and $|\eta|$, respectively. The fit uncertainty on $\alpha$ is represented by the dotted 
lines. 

Figure 26: (a): the L2 NN tagging efficiency (circles), the tagging efficiency after an away-tag requirement (squares), and their 
ratio, $\beta$ (up triangles and fit) in the $b \to \mu X$ MC sample as a function of jet $p_T$. (b): same as a function of $|\eta|$. (c) and (d): 
same for the Tight NN tagger as a function of jet $p_T$ and $|\eta|$, respectively. The fit uncertainty on $\beta$ is too small to be visible 
in this figure. 



8.4. Further corrections 

The b-tagging efficiency obtained with the SystemD method is valid for jets with a semimuonic decay of 
the b hadron. To obtain the efficiency for inclusive jets not biased by the requirement of such a decay, a 
correction is determined using a sample of simulated b jets with b hadrons decaying inclusively or as $b \to \mu X$. 
The final efficiency is then defined as 

$$
\varepsilon_b^{data} = \varepsilon_{b \to \mu X}^{data} \cdot \frac{\varepsilon_b^{MC}}{\varepsilon_{b \to \mu X}^{MC}} = SF_b \cdot \varepsilon_b^{MC}, \tag{15}
$$

where $SF_b = \varepsilon_{b \to \mu X}^{data} / \varepsilon_{b \to \mu X}^{MC}$ is the data-to-simulation efficiency scale factor, and $\varepsilon_{b \to \mu X}^{data}$ is identical to the 
quantity denoted as $\varepsilon_b^{l}$ in Eq. [14]. The tagging efficiency for c-quark jets is not measured in data. It is 
assumed that the data-to-simulation scale factor is identical for b and c jets. The c-jet tagging efficiency is 
then derived from the simulation as 

$$
\varepsilon_c^{data} = SF_b \cdot \varepsilon_c^{MC}. \tag{16}
$$
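
The arithmetic of Eqs. (15) and (16) can be sketched as follows. The function names and the efficiency values are hypothetical and serve only to show how the scale factor measured in semimuonic jets is transferred to the inclusive b- and c-jet efficiencies.

```python
def scale_factor(eff_data_semimuonic, eff_mc_semimuonic):
    """SF_b: data efficiency over simulated efficiency, both for b -> mu X jets."""
    return eff_data_semimuonic / eff_mc_semimuonic


def data_efficiency(sf_b, eff_mc_inclusive):
    """Eq. (15) for b jets and Eq. (16) for c jets: SF_b times the MC efficiency."""
    return sf_b * eff_mc_inclusive


# Hypothetical inputs, not values from the analysis.
sf_b = scale_factor(eff_data_semimuonic=0.63, eff_mc_semimuonic=0.70)
eff_b_data = data_efficiency(sf_b, eff_mc_inclusive=0.60)
eff_c_data = data_efficiency(sf_b, eff_mc_inclusive=0.15)
```

The same `sf_b` is applied to c jets only because of the stated assumption that the data-to-simulation scale factor is identical for b and c jets.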

8.5. Tagging efficiency parametrization 

The tagging efficiencies are parametrized in terms of the $p_T$ and $\eta$ of the jets. As the use of the 
SystemD method requires high statistics to obtain stable solutions, it is not possible to extract a proper 2D 
parametrization. Instead, it is assumed that the dependence on these variables can be factorized: 

$$
\varepsilon(p_T, \eta) = \frac{\varepsilon(p_T) \cdot \varepsilon(|\eta|)}{\varepsilon_{all}}, \tag{17}
$$

where $\varepsilon_{all}$ is the efficiency for the entire sample, and 

$$
\varepsilon(p_T) = \frac{c}{1 + a e^{-b p_T}}, \tag{18}
$$

where $a$ - $h$ are fit parameters. 
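
A minimal sketch of the factorized parametrization is given below. It assumes a turn-on curve of the form $c / (1 + a e^{-b p_T})$ for the $p_T$ dependence and a product of the two one-dimensional efficiencies normalized by the efficiency of the entire sample. The $|\eta|$ dependence is passed in as a caller-supplied function, since its functional form is not reproduced here. All parameter values are hypothetical.

```python
import math


def eff_pt(pt, a, b, c):
    """Assumed pT turn-on: c / (1 + a * exp(-b * pt)); tends to c at high pT."""
    return c / (1.0 + a * math.exp(-b * pt))


def eff_2d(pt, abs_eta, eff_all, pt_params, eff_eta):
    """Factorized efficiency: eff(pT) * eff(|eta|) / eff_all."""
    return eff_pt(pt, *pt_params) * eff_eta(abs_eta) / eff_all


# Hypothetical fit parameters (a, b, c) and a flat |eta| dependence.
pt_params = (2.0, 0.05, 0.65)
eff_all = 0.55


def flat_eta(abs_eta):
    # Placeholder shape: equal to the sample-wide efficiency everywhere.
    return eff_all


eff_low = eff_2d(20.0, 0.5, eff_all, pt_params, flat_eta)
eff_high = eff_2d(200.0, 0.5, eff_all, pt_params, flat_eta)
```

Dividing by the sample-wide efficiency prevents the overall tagging probability from being counted twice when the two one-dimensional curves are multiplied.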

The data b-jet NN tagging efficiency calculated using the SystemD method is shown in Fig. [27] along 
with the simulated semimuonic b-jet efficiency and $SF_b$. The inclusive b-jet efficiencies, measured in data 
and in simulated events, are shown in Fig. [28]. The corresponding plots for c jets are shown in Fig. [29]. 

Figure 27: (a): the L2 NN tagger scale factors (line) and the data (squares) and MC (circles) semimuonic b-jet efficiencies as 
a function of jet $p_T$. (b): same as a function of jet $|\eta|$. (c) and (d): same for the Tight NN tagger, as a function of jet $p_T$ and 
$|\eta|$, respectively. The functions used for the parametrization are outlined in the text and the dotted curves represent the $\pm 1\sigma$ 
statistical uncertainty. 



8.6. Systematic uncertainties 

Uncertainties on the resulting efficiencies arise from the following sources: the SystemD calculations 
(due to uncertainties on the correction factors as well as limited data statistics), the dependence of the 
efficiencies on the simulated samples, and possible imperfections in their chosen parametrization including 
the assumption of factorization in Eq. [17]. These uncertainties are discussed below. 

8.6.1. SystemD uncertainties 

The correction factors $\alpha$, $\beta$, $\kappa_b$ and $\kappa_{cl}$ (Sec. [8.3]) are evaluated using simulated events, and have non-zero 
statistical uncertainties. The effect of these is evaluated by repeating the SystemD computations with the 
parametrization of each individual factor shifted by its statistical uncertainty, while all other correction 
factors are fixed to their nominal values; the resulting changes in the computed efficiency are interpreted as 
systematic uncertainties. The effect of the choice of minimum $p_T^{rel}$ requirement in the SystemD calculations is 
evaluated by varying it between 0.3 GeV and 0.7 GeV. The total relative systematic uncertainty associated 
with the SystemD correction factors is estimated by adding the individual contributions in quadrature, and 
varies between 1.3% and 1.7% for the different operating points. As an illustration, the results of this 
procedure when solving SystemD for the entire sample are summarized in Table [3] for the NN L2 and Tight 
operating points. 

For each bin in $\eta$ and $p_T$, the SystemD systematic uncertainty for that bin is added in quadrature with 
the statistical uncertainty resulting from the SystemD fit. This yields an overall uncertainty, referred to as 
"statistical uncertainty" below, with which the efficiency is known for each bin, and which is used in the 
fitting of the parametrized curves in $p_T$ and $|\eta|$. 

Figure 28: (a): the L2 NN tagger inclusive b-jet efficiency in both data (line) and MC (circles) as a function of jet $p_T$. (b): 
same as a function of jet $|\eta|$. (c) and (d): same for the Tight NN tagger, as a function of jet $p_T$ and $|\eta|$, respectively. The dotted 
lines represent the fit uncertainty, which is almost entirely inherited from the uncertainty on the scale factor. The functions 
used for the parametrization are outlined in the text. 

Figure 29: (a): the L2 NN tagger inclusive c-jet efficiency in both data (line) and MC (circles) as a function of jet $p_T$. (b): 
same as a function of jet $|\eta|$. (c) and (d): same for the Tight NN tagger, as a function of jet $p_T$ and $|\eta|$, respectively. The dotted 
lines represent the fit uncertainty, which is almost entirely inherited from the uncertainty on the scale factor. The functions 
used for the parametrization are outlined in the text. 

The relative combined statistical and systematic uncertainties as a function of $p_T$ and $|\eta|$ are calculated 
by evaluating 

$$
\Delta\varepsilon^{+}(p_T, |\eta|) = \frac{\varepsilon^{+1\sigma}(p_T) \cdot \varepsilon^{+1\sigma}(|\eta|)}{\varepsilon_{all}^{+1\sigma}} - \frac{\varepsilon(p_T) \cdot \varepsilon(|\eta|)}{\varepsilon_{all}}, \tag{19}
$$

where the $+1\sigma$ quantities are the fluctuations upward by one standard deviation of the quantities introduced 
in Eq. [17]. This is also repeated with the downward fluctuations, and the larger deviation is assigned as the 
uncertainty. 





                   L2       Tight
Efficiency         65.9%    47.6%
$\alpha$           0.0%     0.0%
$\beta$            0.2%     0.6%
$\kappa_b$         0.7%     1.2%
$\kappa_{cl}$      0.3%     0.2%
$p_T^{rel}$        1.0%     0.7%
SystemD Total      1.3%     1.5%

Table 3: NN tagger efficiencies for the complete data sample, and relative systematic uncertainties originating from the SystemD 
method. The total systematic uncertainty is determined by adding the individual uncertainties in quadrature. 



8.6.2. Efficiency parametrization and sample dependence uncertainty 

Both the parametrization and MC sample dependence systematic uncertainties, which result from the 
use of efficiencies derived from generic combined samples of simulated b, c, and muonic b jets, are quantified 
in one measurement. This is done by comparing the relative difference between the actual and predicted 
numbers of tags in various bins in $p_T$ and $\eta$, and for each of the simulated samples used to construct the 
efficiencies. This effectively constitutes a closure test, and a total uncertainty is determined from the spread 
of the relative differences. 

In detail, the relative differences are calculated as a function of $p_T$ in three $|\eta|$ ranges denoted CC 
($|\eta| < 1$), ICR ($1 < |\eta| < 1.8$), and EC ($|\eta| > 1.8$). The relative differences are histogrammed weighted by 
the number of actual tags in the region. The RMS widths of the resulting distributions are used to quantify 
the total uncertainty on each of the efficiencies. The relative uncertainty determined by this method ranges 
from 1.2% for the loosest operating point to 3.5% for the tightest operating point for the inclusive b-jet 
efficiency, and from 2.4% to 4.0% for the inclusive c-jet efficiency. 
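
The closure test can be sketched as a weighted RMS of the per-bin relative differences. Two choices below are assumptions rather than statements from the text: the relative difference is normalized to the predicted number of tags, and the RMS is taken about the weighted mean. The tag counts are hypothetical.

```python
import math


def closure_rms(actual_tags, predicted_tags):
    """RMS width of (actual - predicted) / predicted over bins,
    with each bin weighted by its number of actual tags."""
    rel_diff = [(a - p) / p for a, p in zip(actual_tags, predicted_tags)]
    weight_sum = sum(actual_tags)
    mean = sum(w * r for w, r in zip(actual_tags, rel_diff)) / weight_sum
    variance = (
        sum(w * (r - mean) ** 2 for w, r in zip(actual_tags, rel_diff)) / weight_sum
    )
    return math.sqrt(variance)


# Hypothetical tag counts in four pT bins of one |eta| region.
actual = [980, 2050, 1490, 510]
predicted = [1000.0, 2000.0, 1500.0, 500.0]
spread = closure_rms(actual, predicted)

# A parametrization that predicts every bin exactly has zero spread.
perfect = closure_rms([100, 200], [100.0, 200.0])
```

Weighting by the number of tags keeps sparsely populated bins from dominating the quoted uncertainty.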

8.6.3. Total systematic uncertainty 

Total systematic uncertainties are assigned to all bins of $p_T$ and $\eta$ for $\varepsilon_b$, $\varepsilon_c$, and $SF_b$. They are calculated 
as detailed below and are shown in Table [4] for the L2 and Tight NN operating points. 

$SF_b$: The closure test uncertainty for the $b \to \mu X$ efficiency. 

$\varepsilon_b$: The SF systematic uncertainty added in quadrature with the closure test uncertainty for the b-jet 
efficiency. 

$\varepsilon_c$: The SF systematic uncertainty added in quadrature with the closure test uncertainty for the c-jet 
efficiency. 

The systematic uncertainties for $\varepsilon_b$ range from ±1.9% to ±4.8%, for $\varepsilon_c$ from ±2.8% to ±5.2%, and for 
$SF_b$ from ±1.4% to ±3.4% for the loosest to tightest working points. 

Uncertainty            L2       Tight
MC $b \to \mu X$       2.4%     3.5%
MC b                   1.8%     2.8%
MC c                   2.9%     3.9%
$SF_b$                 2.4%     3.5%
$\varepsilon_b$        3.0%     4.5%
$\varepsilon_c$        3.8%     5.2%

Table 4: Total relative systematic uncertainties on the MC sample parametrizations, and their effect on $\varepsilon_b$, $\varepsilon_c$, and $SF_b$. 
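
As a cross-check of the combination rule described above, the following sketch adds the scale factor uncertainty in quadrature with the closure test uncertainties of Table 4. The helper name is hypothetical; the input values are those of Table 4, in percent.

```python
import math


def add_in_quadrature(*terms):
    """Square root of the sum of squares of the given terms."""
    return math.sqrt(sum(t * t for t in terms))


# Values in percent from Table 4, ordered as (L2, Tight).
sf_b = (2.4, 3.5)        # SF_b, equal to the closure uncertainty for b -> mu X
closure_b = (1.8, 2.8)   # closure test, inclusive b jets ("MC b")
closure_c = (2.9, 3.9)   # closure test, inclusive c jets ("MC c")

eff_b_unc = [round(add_in_quadrature(s, c), 1) for s, c in zip(sf_b, closure_b)]
eff_c_unc = [round(add_in_quadrature(s, c), 1) for s, c in zip(sf_b, closure_c)]
# eff_b_unc -> [3.0, 4.5], eff_c_unc -> [3.8, 5.2], matching the last two rows.
```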

The total uncertainties are computed by adding in quadrature the systematic and the statistical uncertainties 
(which include the SystemD systematic uncertainties, as detailed in Sec. [8.6.1]). They are shown for 
$SF_b$, $\varepsilon_b$, and $\varepsilon_c$ in Fig. [30] for the L2 and Tight operating points. The relative uncertainty increases rapidly 
at high $\eta$ due to limited statistics in that region and because the value of the scale factor drops rapidly for 
$|\eta| > 2$. 

9. Fake Rate Determination 

The determination of the light-flavor mistag rate (where "light" stands for uds-quark or gluon jets) or fake 
rate relies on the notion that in the absence of long-lived particles such as $V^0$s (see Sec. [3.3]), 
high-impact parameter tracks or displaced vertices reconstructed in light-flavor jets result from resolution 
and misreconstruction effects. These effects are expected to lead to tracks with negative impact parameters 
(see Sec. [S] for the impact parameter sign convention used) and displaced vertices with negative decay lengths 
as often as to positive impact parameters and decay lengths. Barring incorrectly assigned negative impact 
parameter signs (which may occur whenever a jet and a track are nearly aligned in azimuth, and which is 
important for long-lived particles), using such negative impact parameter tracks and negative decay length 
vertices should provide a reasonable estimate of the fake rate. 

9.1. Data sample 

To minimize the impact of incorrectly attributed impact parameter signs, the fake rate is determined in 
an inclusive jet sample with low heavy flavor content. Two samples are used for this purpose: 

• A sample consisting of events selected by requiring at least one electron candidate with $p_T > 4$ GeV 
and with low missing transverse energy, $\not\!\!E_T < 10$ GeV, and referred to below as the EM sample. As in 
Sec. [8.3], at least one trigger without lifetime bias is required. Most of the electron candidates are jets 
that deposit a large fraction of their energy in the EM section of the calorimeter. This may reduce the 
heavy flavor content of the sample, as the fraction of a jet's energy deposited through electromagnetic 
processes depends on the jet flavor. This bias is removed by only considering jets whose distance to 
the nearest identified EM cluster is $\Delta R > 0.4$. After these requirements, the EM sample contains 106 
million taggable jets. 

• An inclusive jet sample (referred to in the following as the QCD sample) consisting of all events 
collected using jet triggers. It contains 154 million taggable jets. For consistency, jets in the vicinity of 
identified EM clusters are not considered in this sample either. Since trigger requirements should not 
bias such jets in this sample, the effect of this removal should be small and will be evaluated below. 

These two samples are combined (referred to as the COMB sample) for most studies; their comparison 
allows a systematic uncertainty associated with the choice of a particular sample to be estimated. The $p_T$ 
and $\eta$ distributions in these samples, after the taggability requirement, are compared in Figure [31]. The 
main difference is an increased high-$p_T$ content in the QCD sample due to trigger effects. 

Figure 30: The total relative uncertainty (combined systematic and statistical) for the Scale Factor ($SF_b$) (a, b), $\varepsilon_b$ (c, d) and 
$\varepsilon_c$ (e, f) as a function of $p_T$ (a, c, e) when $\eta = 1.2$, and $\eta$ (b, d, f) when $p_T = 45$ GeV. 

Figure 31: Jet $\eta$ (a) and $p_T$ (b) distributions in the QCD and EM data samples. 



9.2. Negative tag rate 

The use of negative impact parameter tracks and negative decay length displaced vertices is rather 
straightforward: the algorithms providing the NN input variables listed in Sec. [7.1.1] need only minor modifications 
in order to provide "negative" equivalents of these variables, called Negative Tag (NT) results. The 
NN output is then recomputed using the NT values rather than the original ones, and the fraction of jets 
tagged in this modified fashion represents the negative tag rate. The NN input NT results are computed as 
follows: 

CSIP: The $N_{CSIP}$ variable is recalculated using tracks with negative instead of positive impact parameter 
significance to obtain the "strong classifier" numbers of tracks $N_{3s}$ and $N_{2s}$ (see Sec. [E]). 

JLIP: The Jet Lifetime Probability $P_{JLIP}$ is recomputed using only tracks with negative rather than positive 
impact parameter significance. 

SVT: In this case, no additional computation is necessary. Instead of the highest (positive) decay length 
significance, the most negative decay length significance displaced vertices (for both the SuperLoose 
and Loose algorithm versions discussed in Sec. [2]) are used to supply the SVT-related NN variables. 

Like the b-jet efficiency, the fake rate and negative tag rate are parametrized as functions of a jet's 
kinematical ($p_T$, $|\eta|$) variables. However, in contrast to the efficiency, the data shows that the dependence of 
the NT rate on jet $p_T$ and $|\eta|$ cannot be factorized into a dependence on $p_T$ multiplied by a dependence on 
$|\eta|$. Instead, it is parametrized as a function of jet $p_T$ in three regions: $0 < |\eta| < 1.0$ (CC), $1.0 < |\eta| < 1.8$ 
(ICR), and $1.8 < |\eta| < 2.5$ (EC). In each region, the $p_T$ dependence is parametrized using a quadratic 
polynomial. The negative tag rate parametrizations in the three $|\eta|$ regions are shown in Fig. [32] for the L2 
and Tight operating points. 
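
The region-wise parametrization can be sketched as a lookup of the $|\eta|$ region followed by the evaluation of a quadratic polynomial in $p_T$. The coefficients below are hypothetical placeholders, not fit results, and the treatment of the region boundaries (lower edge inclusive) is an assumption.

```python
def eta_region(abs_eta):
    """Map |eta| to the calorimeter region used for the parametrization."""
    if abs_eta < 1.0:
        return "CC"
    if abs_eta < 1.8:
        return "ICR"
    if abs_eta < 2.5:
        return "EC"
    raise ValueError("jet outside the parametrized range |eta| < 2.5")


def nt_rate(pt, abs_eta, coefficients):
    """Quadratic polynomial in pT with region-dependent coefficients."""
    c0, c1, c2 = coefficients[eta_region(abs_eta)]
    return c0 + c1 * pt + c2 * pt * pt


# Hypothetical coefficients (c0, c1, c2) per region, pT in GeV.
coefficients = {
    "CC": (0.005, 1.0e-4, 1.0e-7),
    "ICR": (0.006, 1.2e-4, 1.0e-7),
    "EC": (0.004, 0.8e-4, 2.0e-7),
}

rate_cc = nt_rate(50.0, 0.5, coefficients)
rate_ec = nt_rate(50.0, 2.0, coefficients)
```

Keeping one polynomial per region is what allows the $p_T$ dependence to differ between regions, which a single factorized form would not permit.
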

Figure 32: The NT rate parametrization for the COMB sample in the CC (circles), ICR (squares), and EC (triangles) for 
the L2 (a) and Tight (b) operating points. The negative tag rate is parametrized with a second order polynomial. The fit 
uncertainty is too small to be visible in this figure. 

9.3. Corrections 

The NT rate is not a perfect approximation of the fake rate. Corrections for the following effects are 
applied: 

• The presence of heavy flavor jets increases the NT rate, primarily due to tracks that originate from 
the decay of long-lived particles and are (mistakenly) assigned a negative impact parameter sign. As 
no method is available to estimate this effect on data, QCD events simulated using the Pythia event 
generator are used instead. This results in a correction factor $F_{hf} = \varepsilon^{NT}_{QCD,light} / \varepsilon^{NT}_{QCD,all}$, i.e., the ratio 
of the NT rates without and with heavy flavor jets present in these simulated events. 

• The $V^0$ removal algorithm (see Sec. [3.3]) is not fully efficient, so that some contribution from long-lived 
particles and photon conversions remains. Most of the resulting tracks will correctly be assigned 
positive impact parameters, and the NT rate is affected less by their presence than the fake rate. This 
effect is estimated using the same simulated QCD events as for the $F_{hf}$ factor, leading to a correction 
factor $F_{lf} = \varepsilon_{QCD,light} / \varepsilon^{NT}_{QCD,light}$, i.e., the ratio of the positive- and negative-tag rates in the simulated 
light-flavor events. 

Finally, the fake rate $\varepsilon_{light}$ is estimated as the NT rate measured in data, $\varepsilon^{NT}_{data}$, corrected for the above 
effects: 

$$
\varepsilon_{light} = \varepsilon^{NT}_{data} \cdot F_{hf} \cdot F_{lf}. \tag{20}
$$

The jet $p_T$ dependences of $F_{hf}$ and $F_{lf}$ are shown in Fig. [33] for the L2 and Tight operating points. The 
estimated light quark tagging efficiencies $\varepsilon_{light}$ for the L2 and Tight operating points are shown in Fig. [34]. 
Both the negative and positive tag rates for light-flavor jets increase with increasing jet $p_T$, for two reasons: 
(i) the multiplicity of long-lived particles and their average decay length increase; and (ii) jets become 
more collimated, with the resulting higher hit density leading to a larger number of wrongly reconstructed 
high-impact parameter tracks. 
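
The correction chain of Eq. (20) can be written out as a short sketch. The function name and all rates are hypothetical. The heavy flavor correction is taken as the simulated light-flavor NT rate divided by the NT rate for all jets, and the light-flavor asymmetry correction as the simulated positive tag rate divided by the negative tag rate for light-flavor jets.

```python
def fake_rate(nt_rate_data, nt_rate_mc_light, nt_rate_mc_all, pos_rate_mc_light):
    """Return (fake rate, F_hf, F_lf) following Eq. (20)."""
    f_hf = nt_rate_mc_light / nt_rate_mc_all      # removes the heavy flavor contribution
    f_lf = pos_rate_mc_light / nt_rate_mc_light   # positive/negative asymmetry for light jets
    return nt_rate_data * f_hf * f_lf, f_hf, f_lf


# Hypothetical rates, not values from the analysis.
eff_light, f_hf, f_lf = fake_rate(
    nt_rate_data=0.010,
    nt_rate_mc_light=0.008,
    nt_rate_mc_all=0.010,
    pos_rate_mc_light=0.012,
)
```

In this example the two corrections act in opposite directions: `f_hf` is below 1 because heavy flavor inflates the NT rate, while `f_lf` is above 1 because residual long-lived particles raise the positive tag rate more than the negative one.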

9.4. Systematic uncertainties 

The use of a particular sample (in this case, the combination of the QCD and EM multijet samples) to 
provide a "universal" estimate of the NT rate needs to be validated. To this end, the ratio of the NT rates 
as measured in the separate QCD and EM samples is determined as a function of the kinematical variables 
and is shown in Fig. [35] for the L2 and Tight operating points. 

A corresponding systematic uncertainty is calculated from a constant fit to the EM/QCD NT rate ratio. 
Half the difference between the fit value and unity is taken as the systematic uncertainty, or if the ratio is 
consistent with unity within the fit uncertainty scaled by $\sqrt{\chi^2/N_{dof}}$, this scaled fit uncertainty is taken as 
the uncertainty. The relative uncertainty ranges from 0.2% to 1.3% in the CC, 0.3% to 0.7% in the ICR 
and 1.2% to 3.1% in the EC from the loosest to tightest operating points. 
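
The rule for assigning this systematic uncertainty can be sketched as follows. The constant fit is implemented here as an inverse-variance weighted mean, which is the least-squares solution for a constant; whether the analysis used exactly this fit is an assumption. The ratio values and their uncertainties are hypothetical.

```python
import math


def constant_fit(values, errors):
    """Least-squares fit of a constant: returns (value, uncertainty, chi2/ndof)."""
    weights = [1.0 / (e * e) for e in errors]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    sigma = 1.0 / math.sqrt(weight_sum)
    chi2 = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, sigma, chi2 / (len(values) - 1)


def ratio_systematic(values, errors):
    """Scaled fit uncertainty if the ratio is consistent with unity,
    otherwise half the difference between the fitted ratio and unity."""
    mean, sigma, chi2_ndof = constant_fit(values, errors)
    scaled_sigma = sigma * math.sqrt(chi2_ndof)
    deviation = abs(mean - 1.0)
    if deviation <= scaled_sigma:
        return scaled_sigma
    return 0.5 * deviation


# Hypothetical EM/QCD ratios in four pT bins.
syst_offset = ratio_systematic([1.02, 1.03, 1.025, 1.02], [0.005] * 4)
syst_unity = ratio_systematic([1.01, 0.99, 1.00, 1.005], [0.01] * 4)
```

In the first case the fitted ratio (1.02375) differs significantly from unity, so half the difference is assigned. In the second case the ratio is compatible with unity and the scaled fit uncertainty is used instead.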

In addition, the effect of removing jets in the vicinity of EM clusters in the QCD sample needs to be 
taken into account. The NT rate in the QCD sample with the jets removed is slightly lower than in the 
full QCD sample. The effect is small, ranging from 0.2% for the loosest to 1% for the tightest operating 
point, and does not depend on jet $p_T$. A systematic uncertainty is assigned in the same way as that for the 
difference between the EM and QCD samples, and ranges from 0.1% to 0.6% in the CC, 0.1% to 0.3% in 
the ICR, and 0.1% to 0.7% in the EC from the loosest to tightest operating points. 

To test the parametrization of the NT rate in the three $|\eta|$ regions, a comparison is made between 
the number of tags found by the tagger and its prediction from the parametrized NT rate. A systematic 
uncertainty is again calculated from a constant fit to the ratio of the actual and predicted number of tags, 
following the same procedure as the EM/QCD sample comparison. The systematic uncertainty ranges from 
0.1% to 0.3% in the CC/ICR and from 0.1% to 0.7% in the EC from the loosest to tightest operating points. 

The $F_{hf}$ correction factor depends on the assumed b- and c-fractions in the multijet data sample. In turn, 
these depend on the cross sections for QCD heavy flavor production. To estimate the uncertainty on $F_{hf}$, 
the fraction of b (c) jets is varied from its default value of 2.6% (4.6%) by 50% (relative). Given that the 
individual b- and c-production mechanisms are very similar, these fractions are varied coherently. The total 
uncertainty from varying the fraction of b (c) jets ranges from 2.8% (1.6%) for the loosest to 19.5% (4.7%) 
for the tightest operating point in the CC/ICR and from 1.0% (0.7%) to 6.7% (3.5%) in the EC regions. 

Figure 33: The light jet asymmetry correction, $F_{lf}$ (circles), heavy flavor correction, $F_{hf}$ (squares), and total negative tag 
correction (triangles) in the CC (a, b), ICR (c, d) and EC regions (e, f) for the L2 (a, c, e) and Tight (b, d, f) operating points. 
The solid and dotted lines indicate the fit of the total correction factor and its uncertainty, respectively. 

Figure 34: The estimated light quark tagging efficiency parametrized in the CC (continuous line), ICR (dot-dashed line) and 
EC (dot-dot-dashed line) for the L2 (a) and Tight (b) operating points. The dotted black lines represent the fit uncertainty. 

Figure 35: The ratio of the EM and QCD negative tag rates for the L2 (a) and Tight (b) operating points in the CC (circles, 
continuous line), ICR (squares, dot-dashed line), and EC regions (triangles, dot-dot-dashed line). 

The total uncertainty on the fake tag rate is given by adding in quadrature the systematic contributions 
(as discussed above) for the appropriate region to the statistical uncertainty, estimated as the difference 
between the fake tag rate central value and one standard deviation fit curves. The dominant contribution 
is the systematic one. The combined relative systematic uncertainty ranges from 5.9% (for the loosest 
operating point) to 23.1% (for the tightest operating point) in the CC region, from 4.8% to 24.2% in the 
ICR region, and from 2.2% to 10.0% in the EC region. A more detailed breakdown of the systematic 
uncertainties is shown in Tables [5] and [6] for the L2 and Tight operating points, respectively. 



Region             CC       ICR      EC
Parametrization    0.1%     0.1%     0.1%
EM/QCD             0.5%     0.5%     1.6%
EM veto            0.2%     0.1%     0.1%
c fraction         3.7%     3.3%     1.3%
b fraction         7.3%     6.4%     2.2%
Total              11.0%    9.7%     3.8%

Table 5: Fake tag rate relative systematic uncertainties for the L2 NN operating point. 



Region             CC       ICR      EC
Parametrization    0.1%     0.2%     0.4%
EM/QCD             1.0%     0.5%     1.6%
EM veto            0.5%     0.2%     0.5%
c fraction         4.5%     4.5%     1.4%
b fraction         14.3%    13.8%    4.2%
Total              18.8%    18.3%    5.9%

Table 6: Fake tag rate relative systematic uncertainties for the Tight NN operating point. 
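
The totals in Tables 5 and 6 are larger than a plain quadrature sum of the individual rows. One reading that reproduces them, inferred here from the numbers rather than stated in the text, is that the coherently varied b- and c-fraction contributions are added linearly before being combined in quadrature with the remaining sources. The sketch below checks this reading against the tabulated values (in percent); the function name is hypothetical.

```python
import math


def total_systematic(independent, coherent):
    """Linear sum of the coherent terms, then quadrature with the independent ones."""
    coherent_sum = sum(coherent)
    return math.sqrt(coherent_sum ** 2 + sum(x * x for x in independent))


# Per region: ([parametrization, EM/QCD, EM veto], [c fraction, b fraction], quoted total)
table5_l2 = {
    "CC": ([0.1, 0.5, 0.2], [3.7, 7.3], 11.0),
    "ICR": ([0.1, 0.5, 0.1], [3.3, 6.4], 9.7),
    "EC": ([0.1, 1.6, 0.1], [1.3, 2.2], 3.8),
}
table6_tight = {
    "CC": ([0.1, 1.0, 0.5], [4.5, 14.3], 18.8),
    "ICR": ([0.2, 0.5, 0.2], [4.5, 13.8], 18.3),
    "EC": ([0.4, 1.6, 0.5], [1.4, 4.2], 5.9),
}

# Difference between the recomputed and the quoted total, per table and region.
differences = {
    (name, region): total_systematic(indep, coh) - quoted
    for name, table in (("L2", table5_l2), ("Tight", table6_tight))
    for region, (indep, coh, quoted) in table.items()
}
```

All six recomputed totals agree with the quoted ones to within 0.1 percentage points, which is the precision expected given that the inputs are themselves rounded.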



10. Summary and Conclusion 

Several techniques to identify b jets exploiting the long lifetime of b hadrons have been discussed in this 
article. Compared to the use of individual b-jet tagging algorithms, the combination of their results in an 
artificial neural network leads to a considerable improvement in performance. 

Figure 36: Data performance profile of the NN tagger as applicable to the kinematics of $Z \to b\bar{b}$ and $Z \to q\bar{q}$ events for all jets 
(dashed line) and for jets with $|\eta| < 1.1$ and $p_T > 30$ GeV (solid line). The vertical error bars on the plot represent the total 
uncertainty (statistical and systematic uncertainties added in quadrature) on the performance measurements. 

This performance needs to be calibrated using the actual collider data, as the simulation cannot be 
expected to reproduce the performance of the detector in all aspects relevant to b-jet tagging. The calibration 
methods described employ QCD jet samples, and therefore can make use of ample statistics at low jet $p_T$. 

• Starting from the further requirement of a muon-jet association, the SystemD method allows the 
determination of the b-jet tagging efficiency, with minimal input from simulation, even in the presence 
of an a priori unknown background. 

• The determination of the light-flavor mistag rate makes use of the fact that without such a muon 
requirement, the sample consists almost entirely of light-flavor jets. This method is limited by the 
knowledge on the remaining heavy-flavor content. 

The resulting performance as measured using data, including full statistical and systematic uncertainties, is 
shown in Fig. [36] for all jets and for jets with $|\eta| < 1.1$ and $p_T > 30$ GeV. 

These tagging algorithms and calibration methods have been used in many publications of DO Run IIa 
analyses. They are being refined to make use of a new layer of silicon sensors installed at even smaller 
distance from the beam line [28], as well as to cope with the higher instantaneous luminosities common in the 
Run IIb data taking period that started in June 2006. The calibration methods are not specific to DO and 
could be used at other experiments. 